
Nonnegative Matrix Factorization: An Analytical and Interpretive Tool in Computational Biology

Abstract

In the last decade, advances in high-throughput technologies such as DNA microarrays have made it possible to simultaneously measure the expression levels of tens of thousands of genes and proteins. This has resulted in large amounts of biological data requiring analysis and interpretation. Nonnegative matrix factorization (NMF) was introduced as an unsupervised, parts-based learning paradigm involving the decomposition of a nonnegative matrix V into two nonnegative matrices, W and H, via a multiplicative updates algorithm. In the context of a p×n gene expression matrix V consisting of observations on p genes from n samples, each column of W defines a metagene, and each column of H represents the metagene expression pattern of the corresponding sample. NMF has been primarily applied in an unsupervised setting in image and natural language processing. More recently, it has been successfully utilized in a variety of applications in computational biology. Examples include molecular pattern discovery, class comparison and prediction, cross-platform and cross-species analysis, functional characterization of genes and biomedical informatics. In this paper, we review this method as a data analytical and interpretive tool in computational biology with an emphasis on these applications.

Introduction

The rapid development of high-throughput technologies in the past decade has given rise to large-scale biological data in the form of expression profiles of tens of thousands of genes or proteins, often with only a handful of tissue samples. One of the objectives of a high-throughput experiment such as gene expression microarrays is molecular pattern discovery. The focus is on molecular pattern recognition via unsupervised clustering, and the identification of clusters of samples or genes revealed by their expression patterns. Analysis of genome-wide expression patterns provides unique insights into the structure of genetic networks and into biological processes not yet understood at the molecular level. Class discovery aids in the identification of hidden features in gene expression profiles that reflect molecular signatures of the tissue from which the cells originated.

Dimension reduction and visualization are key aspects in effectively analyzing and interpreting the high-dimensional data in this setting. Such unsupervised approaches are useful and relevant when there is no a priori knowledge of the expected gene expression patterns for a given set of genes or for any phenotype (such as experimental condition, tissue type, or patient). In studies where such prior knowledge is available, the focus is on class comparison and class prediction. In class comparison, the objective is to identify differentially expressed genes among the different classes of interest; in class prediction, however, the emphasis is on building a predictive gene set based on the class labels and expression profiles of known samples, and applying it to a new sample to predict its class. Once a list of potentially interesting genes has been identified from the above-mentioned analyses, one is often interested in characterizing these genes in terms of function. In this paper, we review nonnegative matrix factorization (NMF) and its applications in computational biology, with an emphasis on the analysis and interpretation of high-throughput biological data such as those above. We discuss and illustrate the properties of NMF through examples from the literature, and provide an intuitive interpretation of the factorization and its implicit sparse nature as well as the nonnegativity constraints. In particular, we highlight its unique parts-based, local representation and contrast it with other well-known methods. In addition, we examine the usefulness of its stochastic nature in selecting an appropriate model for a given dataset and for faster implementation of the algorithm.

The paper is organized as follows. First, we introduce the basic principles underlying this technique and provide a summary of its applications in computational biology. We then discuss properties unique to the NMF approach in the analysis and interpretation of large-scale biological data. Next, we address some of the limitations of this approach, and, finally, we provide a discussion and some concluding remarks.

Throughout the remainder of the paper, we will discuss the NMF approach in the context of class discovery (i.e., clustering samples) based on gene expression microarray experiments. This is intended only to serve as an example so as to facilitate a cogent illustration and ease of presentation of this approach. This interpretation is easily extensible to other areas of application in computational biology and should not in any way diminish the scope of the paper.

The NMF Approach

Lee and Seung [1],[2] introduced NMF in its modern form as an unsupervised, parts-based learning paradigm in which a nonnegative matrix V is decomposed into two nonnegative matrices, V≈WH, by a multiplicative updates algorithm. They applied it to text mining and facial pattern recognition. Prior to Lee and Seung's work, a similar approach called positive matrix factorization from Paatero and Tapper [3] was applied as a dimension reduction tool to problems in the environmental sciences and chemometrics [3]–[7]. In the last few years, NMF has been widely used in a variety of areas, including image processing and facial pattern recognition [8]–[19], natural language processing such as in text mining and document clustering (see [20]–[22] and references therein), sparse coding [23]–[27], information retrieval [28],[29], speech recognition [30]–[33], video summarization [34], and Internet research [35],[36]. More recently, this method has found its way into the domain of computational biology. We discuss its applications in this area in the next section. First, we introduce the fundamental principles underlying this approach in the context of a microarray study.

Gene expression data from a set of microarray experiments is typically presented as a matrix in which the rows correspond to expression levels of genes, the columns to samples (which may represent distinct tissues, experiments, or time points), and each entry to the expression level of a given gene in a given sample. In gene expression studies, the number of genes, p, is typically in the thousands; the number of samples, n, is typically less than 100; and the gene expression matrix, V, is of size p×n, whose rows contain the expression levels of p genes in the n samples.

In terms of reducing the dimensionality of the data, the objective in NMF is to find a small number of metagenes, each defined as a nonnegative linear combination of the p genes. This is accomplished via a decomposition of the gene expression matrix V into two matrices with nonnegative entries, V≈WH, where W has size p×k, with each of the k columns defining a metagene, and where H has size k×n, with each of the n columns representing the metagene expression pattern of the corresponding sample. The rank k of the factorization represents the number of latent factors in the decomposition (in our example, the number of clusters). It is generally chosen such that (n+p)k<np, i.e., a number less than n and p. Here, the entry w_ia in the matrix W is the coefficient of gene i in metagene a, and the entry h_aj in the matrix H is the expression level of metagene a in sample j. It should be noted that there is a dual view of the decomposition V≈WH, which defines metasamples (rather than metagenes) and clusters the genes (rather than the samples) according to the entries of W.
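The shapes involved and the rank condition (n+p)k<np can be checked with a short sketch. This is only an illustration under stated assumptions: the values p = 5000, n = 38 and k = 2 are borrowed from the leukemia example used later in the text, W and H are filled with random numbers rather than fitted, and the helper name `max_compressive_rank` is hypothetical.

```python
import numpy as np

def max_compressive_rank(p: int, n: int) -> int:
    """Largest integer k that satisfies (n + p) * k < n * p (hypothetical helper)."""
    return (n * p - 1) // (n + p)

p, n, k = 5000, 38, 2            # genes, samples, rank (illustrative values)
rng = np.random.default_rng(0)
W = rng.random((p, k))           # p x k: each column is a metagene
H = rng.random((k, n))           # k x n: each column is one sample's metagene expression pattern
V_hat = W @ H                    # p x n: same shape as the expression matrix V

stored_factored = (n + p) * k    # number of entries in W and H together
stored_full = n * p              # number of entries in V
```

With these values the factors hold 10,076 numbers against 190,000 in V, so the representation is compressive in the sense of the inequality.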

In order to find an approximate factorization of the matrix V, cost functions that quantify the quality of the approximation need to be defined. Such a cost function can be constructed using some measure of distance between V and the product WH. Examples of such measures include Euclidean distance and Kullback-Leibler (KL) divergence [1],[2],[37],[38]. In the context of facial pattern recognition (and text mining) involving count data, Lee and Seung [1] derived KL divergence based on the reconstruction of an image represented by V from WH by the addition of Poisson noise, i.e., V = WH+ε, where ε is a Poisson random variable.

Devarajan and Ebrahimi [39] generalized this approach based on Renyi's divergence and provided a unified framework for molecular pattern discovery using NMF. This is also based on the Poisson likelihood of generating V from WH [37]. Renyi's divergence is indexed by a parameter α (α≠1) and represents a continuum of distance measures that can be utilized for NMF based on the choice of this parameter. Various well-known distance measures arise from Renyi's divergence as special cases [37]. For example, in the limiting case α → 1, we obtain KL divergence given by

$$KL(V\,\|\,WH) = \sum_{i,j}\left[ V_{ij}\log\frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \right]. \tag{1}$$

This generalization unifies various competing models within a single framework for NMF. Interestingly, Euclidean distance does not fall under this class of distance measures.
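As a rough illustration of the two cost functions named above, the following sketch evaluates the squared Euclidean distance and the generalized KL divergence (the sum over all entries of V log(V/WH) − V + WH) for given factors. The function names and the small constant `eps`, added to avoid taking the logarithm of zero, are assumptions made for this example and are not part of the original formulation.

```python
import numpy as np

def euclidean_cost(V, W, H):
    """Squared Euclidean (Frobenius) distance between V and WH."""
    return float(np.sum((V - W @ H) ** 2))

def kl_cost(V, W, H, eps=1e-12):
    """Generalized KL divergence between V and WH (eps guards log(0))."""
    WH = W @ H
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))

# Small illustrative factors; V_exact is reproduced exactly by W0 @ H0.
rng = np.random.default_rng(0)
W0 = rng.random((6, 2))
H0 = rng.random((2, 4))
V_exact = W0 @ H0

cost_at_exact_fit = kl_cost(V_exact, W0, H0)        # zero: perfect reconstruction
cost_when_shifted = kl_cost(V_exact + 1.0, W0, H0)  # positive: V no longer equals WH
```

Both costs vanish when WH reproduces V exactly and grow as the approximation worsens, which is what makes them usable as objective functions.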

For the problem of decomposing the gene expression matrix V into metagenes (columns of W) and metagene expression patterns (columns of H), our goal is to minimize the objective function defined by the choice of the distance measure, such as in Equation 1. Starting with random initial values for W and H, the algorithm simultaneously updates these two matrices via multiplicative rules until convergence to a local minimum is attained. Cluster membership for each sample is then determined by its highest metagene expression pattern, i.e., sample j is assigned to the cluster a for which h_aj is largest [37],[38]. Details of the algorithm are presented elsewhere [2],[19],[23],[25],[26],[37],[38],[39]. We discuss the stochastic nature of this algorithm further in a later section.
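The procedure described above can be sketched in a few lines. The code below uses the widely known Lee and Seung multiplicative updates for the KL objective, alternating between H and W; it is a minimal illustration rather than the implementation used in the cited papers. The function names, the fixed iteration count in place of a convergence test, the constant `eps`, and the toy data are all assumptions.

```python
import numpy as np

def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence between V and its approximation WH."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))

def nmf_kl(V, k, n_iter=200, seed=0, eps=1e-12):
    """Rank-k NMF of a nonnegative p x n matrix V by multiplicative updates."""
    rng = np.random.default_rng(seed)
    p, n = V.shape
    W = rng.random((p, k)) + eps   # random nonnegative starting values
    H = rng.random((k, n)) + eps
    history = []
    for _ in range(n_iter):
        # Update H with W held fixed.
        H *= (W.T @ (V / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
        # Update W with the new H held fixed.
        W *= ((V / (W @ H + eps)) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        history.append(kl_divergence(V, W @ H))
    return W, H, history

def assign_clusters(H):
    """Assign each sample (column of H) to the metagene with the largest entry."""
    return np.argmax(H, axis=0)

# Toy data: 20 genes x 10 samples with two blocks of co-expressed genes.
rng = np.random.default_rng(1)
V = rng.random((20, 10)) * 0.1
V[:10, :5] += 5.0
V[10:, 5:] += 5.0

W, H, history = nmf_kl(V, k=2)
labels = assign_clusters(H)
```

Because the updates only multiply by nonnegative ratios, W and H stay nonnegative throughout, and `history` records the objective after each iteration so that the decrease toward a local minimum can be inspected.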

Applications of NMF in Computational Biology

In this section, we provide a survey of recent work on NMF with particular emphasis on applications in computational biology. While we have attempted to provide a complete and up-to-date review of its applications in a variety of problems, it is by no means comprehensive. We briefly discuss these applications here, but many of them are further discussed in detail in subsequent sections.

Molecular Pattern Discovery.

The most common application of NMF in computational biology has been in the area of molecular pattern discovery, especially for gene and protein expression microarray studies. This is an exploratory area characterized by a lack of a priori knowledge of the expected expression patterns for a given set of genes or any phenotype. Nonetheless, NMF has proved to be a successful method in the elucidation of biologically meaningful classes. For example, Kim and Tidor [40] applied NMF as a tool to cluster genes and predict functional cellular relationships in yeast using gene expression data, while Heger and Holm [41] used it for the recognition of sequence patterns among related proteins. Brunet et al. [38] applied it to cancer microarray data for the elucidation of tumor subtypes. They developed a model selection algorithm for NMF based on consensus clustering [42] that enables the choice of the appropriate number of clusters in a dataset. Similarly, Gao and Church [43] applied the Sparse NMF approach [20] for uncovering cancer subtypes using microarray data. A similar approach is described by Kim and Park [44]. Carrasco et al. [45] applied NMF for unsupervised clustering of array comparative genomic hybridization data and identified distinct genomic patterns as well as patient subgroups in multiple myeloma (MM). Their analysis uncovered four distinct subclasses, revealing the molecular heterogeneity of MM and the division of the traditional hyperdiploid category into two subclasses.

Devarajan and Ebrahimi [37],[46] successfully applied NMF as a tool for dimensionality reduction and visualization as well as in kinetic expression profiling for analyzing microarray data (Devarajan et al., manuscript in preparation). Pascual-Montano et al. [47],[48] and Carmona-Saez et al. [49] described a method for two-way clustering of gene expression profiles using non-smooth NMF. Pascual-Montano et al. [50] also provided an analytical tool called bioNMF for simultaneous clustering of genes and samples. For more information, the interested reader is referred to http://www.dacya.ucm.es/apascual/bioNMF/. Wang et al. [51] introduced Least Squares NMF, which incorporates the variability of individual measurements in microarray data. They demonstrate improved performance in terms of identification of functionally related genes based on annotations in the Munich Information Center for Protein Sequences (MIPS) database [52].

Class Comparison and Prediction.

Recently, NMF has also been used in a supervised learning framework such as class comparison and class prediction. Fogel et al. [53] used this method to identify ordered sets of genes and employed them in their sequential analysis of variance (ANOVA) procedure for identifying differentially expressed genes using microarray data. They demonstrate improved performance over traditional ANOVA in terms of power and consistency. Okun and Priisalu [54] applied it as a dimension reduction tool in conjunction with several classification methods for protein fold recognition. They show superior performance (in terms of misclassification error rate) of three classifiers based on nearest neighbor methods when applied to NMF-reduced data relative to the original data. Similar applications in magnetic resonance spectroscopic imaging and fold recognition are presented in [55] and [56], respectively.

Cross-Platform and Cross-Species Characterization.

Rapid advances in high-throughput technologies have resulted in the generation of independent large-scale biological datasets using different platforms in several laboratories. It is important to assess and interpret potential differences and similarities in these datasets in order to enable cross-platform and cross-species analysis and the eventual characterization of such data. Tamayo et al. [57] describe an approach called metagene projection for such an analysis and interpretation. Using leukemia and lung cancer data, they demonstrate that metagene projection reduces noise and technological variation while capturing invariant biological features in the data. Furthermore, they show that this procedure enables the use of prior knowledge based on existing datasets in analyzing and characterizing new data [58]. In metagene projection, the dimensionality of a given dataset is reduced using NMF based on a pre-specified rank k factorization. An independently obtained test dataset can then be projected onto this lower, k-dimensional space of metagenes. This is accomplished via the Moore-Penrose generalized pseudo-inverse of W to obtain the projected dataset H_P = W⁻¹V, where W⁻¹ denotes this pseudo-inverse (for details, see [57]). The pseudo-inverse is thus applied to the test dataset, which is then analyzed in the context of the metagenes that characterize the original data. This approach indirectly incorporates the sparse, local representation of NMF and utilizes groups of co-regulated or functionally related genes.
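The projection step can be sketched with NumPy's Moore-Penrose pseudo-inverse. This is a minimal illustration, not the procedure of [57] in full: the function name `project_onto_metagenes` is hypothetical, W here is a random stand-in for metagenes fitted on a reference dataset, and any normalization or refinement that a real analysis would apply is omitted.

```python
import numpy as np

def project_onto_metagenes(W, V_test):
    """Project a p x m test dataset onto the k metagenes in the columns of W."""
    return np.linalg.pinv(W) @ V_test   # k x m projected dataset H_P

# Stand-in metagenes (p = 50 genes, k = 3) and a test set built from known patterns.
rng = np.random.default_rng(0)
W = rng.random((50, 3))
H_true = rng.random((3, 8))
V_test = W @ H_true                     # 8 test samples lying exactly in the metagene space

H_P = project_onto_metagenes(W, V_test)
```

When the test samples lie exactly in the space spanned by the metagenes, as in this constructed example, the projection recovers the generating patterns. For real data the pseudo-inverse gives a least-squares fit, and the projected values are not guaranteed to be nonnegative.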

Biomedical Informatics.

Text mining is concerned with the recognition of patterns or similarities in natural language text. The application of NMF in this area goes back to the original paper by Lee and Seung [1]. Other applications include [20],[21] and references therein. In this context, the matrix V is a summary of a corpus of documents in which the rows and columns represent, respectively, the words in the vocabulary and the documents in the corpus. The entries of V denote the frequencies of words in each document. NMF is applied to identify subsets of semantic categories and to cluster the documents based on their association with these categories. Chagoyen et al. [22] present an interesting application of this approach in computational biology. Here, literature profiles are created from a corpus of documents relevant to large sets of genes and proteins using common semantic features extracted from the corpus. Genes are then represented as additive linear combinations of the semantic features, which can be further used for studying their functional associations. The authors illustrate the advantages of using NMF in identifying and interpreting the semantic features compared to other methods. Existing information about the biological entities under study can thus be used with NMF to establish putative relationships among subsets of genes and proteins that characterize a subset of the data.

Functional Characterization of Genes.

Pehkonen et al. [59] utilize NMF for analyzing functional heterogeneity in a gene list and identifying homogeneous functional groups. In this approach, NMF is applied to the sparse, binary matrix formed on the basis of associations of relevant genes with functional classes obtained from the Gene Ontology database [60]. A non-nested hierarchical clustering scheme showing the over-represented functional groups from the gene list is created from different rank factorizations and demonstrated to better characterize groups of genes compared to contemporary approaches. For details, please refer to [59]. This methodology is implemented in the program called GENERATOR (GENElist Aimed Theme-discovery execuTOR).

Other Applications.

Tresch et al. [61] applied this method for the identification of muscle synergies, while Kim et al. [62] used it to determine neural activity patterns. Hiisilä et al. [63] applied this and other dimension reduction methods for assessing the dependencies between transcription factor binding sites. Other areas of application of this method for problems involving large-scale biological data include color and vision research [64], structure-based drug design [65],[66], and magnetic resonance imaging [55],[56],[67].

Parts-based Local Representation

There are several methods available for unsupervised clustering besides NMF. These include, but are not limited to, hierarchical clustering (HC), self-organizing maps (SOM), principal component analysis (PCA), vector quantization (VQ), K-means clustering, and multi-dimensional scaling. Hastie et al. [68] provide a comprehensive overview of these methods. Ross and Zemel [69] note that when data are represented as vectors, parts manifest themselves as subsets of the data dimensions that take on values in a coordinated fashion. While this is pertinent to these methods in general, none of them provides a sparse, parts-based local representation, a property that appears to be unique to NMF. Donoho and Stodden [70] provide an elegant geometric interpretation of NMF and discuss the conditions under which this approach gives a valid parts-based decomposition. In this section, we explore this particular property of NMF in detail, within the context of several applications.

Interpretation of the Factored Matrices.

The metagene coefficient w_ia quantifies the effect of the ath metagene expression pattern h_aj on the expression of the ith gene in the jth sample, the sample being represented by the corresponding column of the gene expression matrix V. For a rank k factorization, the relative magnitudes of the non-zero entries in each of the k metagenes reflect the importance of the corresponding genes, and the expression pattern of each metagene across the n samples (represented by each row of H) reflects the relevance of the corresponding latent factor. Here, k is the number of clusters or hidden variables in the decomposition. The NMF model is graphically illustrated at http://www.dacya.ucm.es/apascual/bioNMF/model.html.

The NMF representation also ensures that a single metagene expression pattern influences several samples. Lee and Seung [1] graphically illustrate this property in the form of a network. In essence, the metagenes provide a summary of the behavior of genes across the samples, whereas the metagene expression patterns provide a summary of the behavior of samples across the genes. There is strong evidence suggesting that the metagenes and the metagene expression patterns provide a sparse, parts-based representation of the gene expression data [1], [37], [38], [39], [40], [43], [46]–[50], potentially identifying local hidden variables or clusters.

NMF can be viewed as an approach for modeling the generation of gene expression measurements for samples (observable variables given by columns of V) from metagene expression patterns (hidden variables given by columns of H) [1]. In the context of clustering samples represented by the columns of V, the parts identify classes of samples that belong to specific clusters and are represented by the expression patterns of metagenes across samples (or the rows of H). In addition, genes with corresponding non-zero metagene coefficients represent those that are co-expressed in these samples. These parts provide a reduced representation of the original data, and their co-activation can be viewed as being equivalent to co-regulation or co-expression of groups of genes. Similarly, we can interpret the parts in other areas of application. For instance, in facial pattern recognition, where each column of V corresponds to a face, the parts represent the different parts of a face such as nose, mouth, etc.; in text mining and document clustering, where each column of V contains word counts from documents, the parts represent the different semantic categories.

Let us consider the widely used leukemia data available from http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi as an illustrative example. This dataset consists of 5,000 gene expression measurements each from 38 bone marrow samples from acute myelogenous leukemia (AML) and acute lymphoblastic leukemia (ALL). There are 27 ALL samples, comprising 19 B type and 8 T type, and 11 AML samples. For a rank k = 2 factorization, let w1 and w2 represent the two metagenes (columns of W) and let h1 and h2 represent the corresponding metagene expression profiles (rows of H). The sparseness of the metagene coefficients is illustrated in Table 1, based on a single run of the NMF algorithm using Equation 1. In this table, we list the fraction of genes whose corresponding metagene coefficients lie in the indicated range. The histograms and densities of w1 and w2 are shown in Figure 1A–1D. Only 53 and 77 genes, corresponding to w1 and w2, respectively, have coefficients that are at least 10 in magnitude. These genes could potentially behave in a strongly correlated fashion in a subset of the samples; this is determined by the metagene expression profiles across the 38 samples, h1 and h2. The densities of these expression profiles and the profiles themselves are shown in the top (Figure 2A and 2B) and bottom (Figure 2C and 2D) panels of Figure 2, respectively. Here, “L” and “M” denote, respectively, an ALL and an AML sample. It is evident from this figure that there is a clear separation between the ALL and AML samples.
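The kind of summary described above, counting the genes whose metagene coefficient reaches a threshold, can be sketched as follows. The matrix `W_toy` is made up for illustration and is not the leukemia data, and the helper name is hypothetical; only the counts 53 and 77 and the total of 5,000 genes are taken from the text.

```python
import numpy as np

def large_coefficient_summary(W, threshold):
    """Per-metagene count and fraction of genes with coefficient >= threshold."""
    counts = (W >= threshold).sum(axis=0)
    return counts, counts / W.shape[0]

# Made-up 4-gene, 2-metagene coefficient matrix (not real data).
W_toy = np.array([[12.0, 0.1],
                  [0.3, 15.0],
                  [0.0, 11.0],
                  [0.2, 0.4]])
counts, fractions = large_coefficient_summary(W_toy, threshold=10)

# Fractions implied by the counts quoted in the text: 53 and 77 genes out of 5,000.
reported = np.array([53, 77]) / 5000
```

The last line gives 0.0106 and 0.0154, i.e., roughly 1% and 1.5% of the genes for the two metagenes.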

Figure 1. Gene coefficients.

(A) Histogram of gene coefficients, metagene 1. (B) Histogram of gene coefficients, metagene 2. (C) Density of gene coefficients, metagene 1. (D) Density of gene coefficients, metagene 2.

https://doi.org/10.1371/journal.pcbi.1000029.g001

Figure 2. Expression profile.

(A) Density of expression profile, metagene 1. (B) Density of expression profile, metagene 2. (C) Expression profile across samples, metagene 1. (D) Expression profile across samples, metagene 2.

https://doi.org/10.1371/journal.pcbi.1000029.g002

Table 1. Distribution of metagene coefficients: Leukemia data, k = 2.

https://doi.org/10.1371/journal.pcbi.1000029.t001

Interpretation of Nonnegativity Constraints.

The nonnegativity constraints in NMF are compatible with the intuitive notion of combining parts to form a whole, i.e., they provide a parts-based local representation of the data. This is in contrast to the holistic representation of the data provided by VQ and the distributed representation provided by PCA [1]. A parts-based model not only provides an efficient representation of the data but can potentially aid in the discovery of causal structure within it and in learning relationships between the parts [69]. In NMF, the factorization results in a reconstruction of the original data by only the addition of components due to the nonnegativity constraints, while in PCA it is a superposition of orthogonal components with arbitrary signs that lack intuitive meaning and physical interpretation. In some applications, negative coefficients may contradict physical reality. For instance, in image reconstruction, the pixels in a grayscale image with negative intensities cannot be meaningfully interpreted.

The nonnegative coefficients also have an elegant interpretation from a neuroscience perspective. For example, they can be interpreted as the firing rates (and synaptic strengths) of neurons in the brain, and the nonnegativity constraints account for the additive firing rates that are co-activated in physiological perception. Lee and Seung [1] propose that these constraints on firing rates may be important for developing sparse, parts-based representations for perception. The coefficients could also be interpreted as the magnitudes of muscle activation patterns that can aid in the identification of muscle synergies [61].

In the context of our gene expression example, the nonnegative coefficients in each metagene are easily interpretable as the relative contribution of genes, unlike in PCA and VQ. Returning to our leukemia example, we observe that only a small fraction of the genes (1% and 1.5%, respectively, corresponding to the two metagenes w1 and w2) contributes significantly to the delineation of the ALL and AML samples. The identification of such a small subset of active genes is possible only due to the nonnegativity constraints, which are a requirement for such a parts-based representation.

The perception of the whole is simply an additive linear combination of its parts, represented by the metagenes and metagene expression profiles. Due to the nonnegativity constraints, orthogonality of metagenes and metagene expression profiles cannot be achieved in practice. However, this is an extremely useful property, since the dependence among the gene expression profiles typically present in a microarray study can be captured by overlapping metagenes. This property makes NMF particularly well-suited for the analysis of large-scale biological data, where it is essential to capture relationships underlying inter-connected biological pathways or processes. In terms of this property, NMF has been shown to be superior to other dimension reduction methods (see [20] and references therein). While the decomposition V≈WH is linear, it is important to note that the computation of the update rules for W and H is non-linear due to the nonnegativity constraints [1].

Enforcing Sparseness.

Although the original NMF approach has been shown to provide a naturally sparse, parts-based, and local representation as seen in [1],[37],[38],[39],[40],[45],[46], there is also some evidence that points to a parts-based but holistic (rather than local) representation produced by NMF [19], [23]–[25]. Lee and Seung [1] note that sparseness in both the metagenes and metagene expression profiles is crucial to a parts-based representation. The nonnegativity constraints may be a necessary requirement for such a parts-based representation, but they may not be sufficient to achieve sparseness. In such a case, it may be desirable to explicitly enforce sparseness on the metagenes and the metagene expression patterns. Recent work has focused on imposing such explicit sparseness constraints on the columns of H or W or both [19]–[21], [23]–[26], [43], [44], [47]–[50]. This is generally achieved via the addition of appropriate penalty terms to the objective function defined by the distance measure of our choice. For instance, one could impose a constraint on the metagene expression patterns H. An example of such a constraint is the sum of the entries of H, Σ_aj H_aj. Other penalty terms can also be used as appropriate (see [19]–[21],[25],[43]). Using KL divergence as defined in Equation 1, our objective function would then be

$$\sum_{i,j}\left[ V_{ij}\log\frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \right] + \lambda \sum_{a,j} H_{aj}, \tag{2}$$

where λ>0. The parameter λ quantifies the trade-off between goodness-of-fit of the model (defined by KL divergence) and sparseness.
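The penalized objective described above, KL divergence plus λ times the sum of the entries of H, can be evaluated with a short function. This sketch only computes the objective; it does not derive or implement the corresponding update rules. The function name, the constant `eps`, and the example values of λ are assumptions made for illustration.

```python
import numpy as np

def sparse_kl_objective(V, W, H, lam, eps=1e-12):
    """KL divergence between V and WH plus lam times the sum of the entries of H."""
    WH = W @ H
    kl = np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH)
    return float(kl + lam * H.sum())

rng = np.random.default_rng(0)
W = rng.random((8, 2))
H = rng.random((2, 5))
V = W @ H                          # exact fit, so the KL term is zero

fit_only = sparse_kl_objective(V, W, H, lam=0.0)
light_penalty = sparse_kl_objective(V, W, H, lam=0.5)
heavy_penalty = sparse_kl_objective(V, W, H, lam=1.0)
```

In this constructed case the fit term vanishes, so the objective equals λ times the sum of H and grows with λ, which shows how a larger λ shifts the balance from goodness-of-fit toward smaller entries in H.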

Gao and Church applied the method outlined in [20] to cancer microarray data and explicitly enforced sparseness via the sum of squares of the entries of H. They demonstrated improved performance (in terms of misclassification error rate, defined as the proportion of samples misclassified by the method across all clusters) over standard NMF and identified subsets of co-expressed genes that may be involved in cancer. Pascual-Montano et al. [47],[48],[50] adopt a different approach for enforcing sparseness. They use a smoothing operator to simultaneously enforce sparseness on both W and H. Independent of the approach, enforcing sparseness on the metagenes and metagene expression profiles across samples aids in the detection of sharp boundaries between different classes. We noted earlier that orthogonality of metagenes and metagene expression profiles cannot be achieved in practice due to the nonnegativity constraints. However, the enforcement of sparseness constraints decreases their overlap, thus resulting in local, disjoint groups of samples or genes, respectively.

Capturing Context-Dependent Patterns.

In contrast to traditional clustering and dimension reduction methods, NMF has been demonstrated to identify subtle, context-dependent biological patterns as well as being less sensitive to the selection and/or perturbation of input genes used in the factorization. Such context dependency is not captured by standard two-way clustering approaches [38]. For instance, NMF has been shown to be capable of identifying patterns that exist in only a subset of the samples, whereas standard methods focus on the overall structure of a dataset (i.e., on samples for which similarity in expression extends across all genes), thus overlooking the subtle features that represent relevant biological patterns [1],[38],[40]. In essence, NMF aids in the detection of localized patterns of similar expression by identifying a small subset of genes that act in a strongly correlated fashion in a subset of the samples. As noted before, such localized patterns may point to groups of co-regulated or functionally related genes [38],[43],[47],[48],[50]. For example, groups of genes and samples that show high coefficients for a given metagene (column of W) and the corresponding metagene expression pattern (row of H), respectively, may be strongly related in a subset of the data, thus constituting a gene-sample bi-cluster. Pascual-Montano et al. [47],[48] utilize this property and have developed bioNMF, a data analytic tool for identifying gene expression bi-clusters [50].
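One simple way to read a gene-sample bi-cluster off fitted factors is to take the genes with the largest coefficients in a column of W together with the samples with the largest entries in the matching row of H. The sketch below does exactly that on made-up factors; the function name, the fixed cut-offs `n_genes` and `n_samples`, and the example matrices are assumptions, and this is not the selection rule used by bioNMF.

```python
import numpy as np

def bicluster(W, H, a, n_genes, n_samples):
    """Indices of the top genes (column a of W) and top samples (row a of H)."""
    genes = np.argsort(W[:, a])[::-1][:n_genes]
    samples = np.argsort(H[a, :])[::-1][:n_samples]
    return np.sort(genes), np.sort(samples)

# Made-up factors: 4 genes, 3 samples, 2 metagenes.
W = np.array([[5.0, 0.0],
              [4.0, 0.1],
              [0.2, 3.0],
              [0.1, 6.0]])
H = np.array([[2.0, 3.0, 0.1],
              [0.1, 0.2, 4.0]])

genes_0, samples_0 = bicluster(W, H, a=0, n_genes=2, n_samples=2)
genes_1, samples_1 = bicluster(W, H, a=1, n_genes=2, n_samples=1)
```

Here the first metagene pairs genes 0 and 1 with samples 0 and 1, while the second pairs genes 2 and 3 with sample 2. In practice the cut-offs would be chosen from the data rather than fixed in advance.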

In a study of functional cellular relationships in yeast, Kim and Tidor [40] observed that the genes with relatively high coefficients in the metagenes were dominated by only a few functional categories. They showed that NMF outperforms all other methods applied, including SVD and K-means, in predicting functional relationships between experiments in comparison to the MIPS classification and the Yeast Proteome Database (YPD) [52]. They remark that out of the 100 strongest functional relationships detected by NMF, 35 and 58 could be verified by MIPS and YPD, respectively, far exceeding those of the other techniques used. Similarly, Gao and Church [43] investigated genes with high metagene coefficients corresponding to each of the three clusters, ALL-B, ALL-T, and AML, in the leukemia data described before. Among these, they identified genes that were enriched in chemokines, oncogenes, tumor suppressor genes, and DNA repair genes.

Stochastic Nature of NMF Algorithm

NMF has proved to be an attractive method for the effective analysis and interpretation of large-scale biological data [37]–[41],[43]–[51],[53]–[57],[59],[61]–[63]. However, due to its nonnegativity constraints, it suffers from an algorithmically more complex implementation relative to a traditional clustering method like HC that is based on pairwise distance computations. There is a substantial gain in efficiency owing to the matrix representation of the NMF update rules. These rules guarantee convergence of the algorithm to a local minimum based on random initial values for W and H. However, the algorithm may not converge to the same solution on each run due to the stochastic nature of the initial conditions, thus requiring it to be run multiple times with random initial values for W and H. The algorithm groups the samples into k clusters, where k is the pre-specified rank of the factorization. As noted before, class membership for each sample is determined based on the highest metagene expression profile [37],[38].
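The following minimal sketch illustrates the points above: multiplicative updates started from random nonnegative W and H, several independent runs, and assignment of each sample to the metagene with the highest coefficient. It assumes the Euclidean cost and the update rules of Lee and Seung [2]; the function names, iteration count, and toy data are illustrative assumptions rather than the settings used in the studies cited.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=None, eps=1e-9):
    """Rank-k NMF of a nonnegative p x n matrix V (Euclidean cost, multiplicative updates)."""
    rng = np.random.default_rng(seed)
    p, n = V.shape
    W = rng.random((p, k)) + eps   # random nonnegative start: metagenes
    H = rng.random((k, n)) + eps   # random nonnegative start: expression patterns
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # eps guards against division by zero
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def assign_clusters(H):
    """Assign each sample (column of H) to the metagene with the largest coefficient."""
    return np.argmax(H, axis=0)

# Toy data (made up): 40 genes, 12 samples, two blocks of co-expressed genes.
rng = np.random.default_rng(0)
V = 0.1 * rng.random((40, 12))
V[:20, :6] += 1.0
V[20:, 6:] += 1.0

W, H = nmf(V, k=2, seed=0)
# Independent runs differ only in their random starting values.
labels_per_run = [assign_clusters(nmf(V, k=2, seed=s)[1]) for s in range(5)]
print(labels_per_run[0])
```

Because the cluster labels are arbitrary, two runs can describe the same partition with the labels swapped, so runs should be compared by which samples group together rather than by label value.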

Model Selection: Choice of k.

The random nature of the algorithm has been shown to be rather useful in providing methods for evaluating the consistency and robustness of its performance. Studies have shown that 50–200 NMF runs are usually sufficient to provide stability to the clustering [37],[38]. As the number of runs increases, the metagene expression patterns across the samples become more localized with decreasing overlapping support, resulting in a sparse, local, and compact representation [38]. This stochastic property can be effectively utilized to assess whether a given rank k provides a biologically meaningful decomposition of the data.

Monti et al. [42] developed a methodology called consensus clustering for evaluating the output of any unsupervised clustering algorithm based on resampling methods. It represents the consensus across multiple runs of the algorithm and quantifies the stability of the discovered clusters. It can also be utilized to assess the sensitivity of a stochastic method like NMF to random initial conditions. Model selection procedures that quantify the robustness of the factorization via consensus clustering have been developed and applied to NMF [37],[38]. In the case of NMF, its stochastic nature is used in the evaluation process, where information from each run of the algorithm is combined as outlined below.

Suppose that we are applying NMF to cluster n samples. For a factorization of given rank k, each run of the algorithm results in an n×n connectivity matrix C with an entry of 1 if samples i and j cluster together and 0 otherwise, where i,j = 1,…,n. The consensus matrix is simply the average connectivity matrix obtained over multiple runs of the NMF algorithm. Final sample assignments and cluster visualization are based on the re-ordered consensus matrix. The robustness of each factorization is evaluated by computing the cophenetic correlation coefficient ρ, where 0≤ρ≤1. A high value of ρ indicates homogeneous clusters. Brunet et al. [38] advocate the use of ρ as a single measure for selecting the appropriate number of clusters by plotting ρ for various choices of the number of clusters k.
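The connectivity and consensus matrices described above, and one way of computing ρ, can be sketched as follows. The sketch assumes that ρ is obtained by treating 1 minus the consensus matrix as a distance and applying average-linkage hierarchical clustering via SciPy; the linkage choice, the function names, and the toy label vectors are assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

def connectivity(labels):
    """n x n matrix with 1 where samples i and j share a cluster, 0 otherwise."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def consensus_matrix(label_runs):
    """Average connectivity matrix over several runs."""
    return np.mean([connectivity(l) for l in label_runs], axis=0)

def cophenetic_rho(consensus):
    """Cophenetic correlation of the distance 1 - consensus (average linkage assumed)."""
    dist = 1.0 - consensus
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)   # upper triangle as a vector
    Z = linkage(condensed, method="average")
    rho, _ = cophenet(Z, condensed)
    return float(rho)

# Made-up cluster labels for 6 samples from 3 runs; run 2 only swaps the label names.
label_runs = [[0, 0, 0, 1, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 1]]
C = consensus_matrix(label_runs)
rho = cophenetic_rho(C)
print(np.round(C, 2))
print(round(rho, 3))
```

Averaging connectivity matrices rather than raw labels makes the result insensitive to label swapping between runs. If every run yields an identical partition with at least two clusters, the consensus matrix contains only 0s and 1s and ρ equals 1.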

Returning to the leukemia example, we applied factorizations of ranks k = 2,3,4,5 based on Equation 1 for 200 runs each. Figure 3 plots ρ versus k, where ρ starts to drop off sharply after k = 2. Figures 4 and 5 show heat maps of the re-ordered consensus matrices based on HC for k = 2,3 (for details see [38]). The homogeneity of coloring seen in these plots indicates the presence of 2 and 3 clusters of samples, delineating the ALL and AML types as well as the B and T subtypes within the ALL class. Here, “L” and “M” denote, respectively, an ALL and an AML sample, while “B” and “T” denote the two ALL subtypes. In each case, two samples are misclassified by this method.

Figure 4. Heat map of re-ordered consensus matrix, k = 2.

https://doi.org/10.1371/journal.pcbi.1000029.g004

Figure 5. Heat map of re-ordered consensus matrix, k = 3.

https://doi.org/10.1371/journal.pcbi.1000029.g005

Other approaches to combining the information across multiple runs are also possible [40],[42]. For example, Kim and Tidor [40] plotted the root-mean-squared error (RMSE) between the original and NMF-reconstructed data as a function of the rank k and used it to choose the appropriate value of k. While the use of RMSE is appropriate when the factorization is based on Euclidean distance, it is important to note that other cost functions require the error to be modified accordingly. For a given rank k factorization, they also demonstrate reproducibility of the metagenes across repeated runs in terms of the correlation between pairs. Moreover, they show that NMF remains robust to the addition of noise to the original data based on the median correlation of the corresponding metagenes across multiple runs, suggesting its potential usefulness as a noise-reduction filter.
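The RMSE-versus-rank approach described above can be sketched as follows, assuming a Euclidean-cost NMF with multiplicative updates. The helper names, the iteration count, and the toy data with three planted blocks are illustrative assumptions and do not reproduce the analysis in [40].

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Rank-k NMF via multiplicative updates for the Euclidean cost."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def rmse(V, W, H):
    """Root-mean-squared error between V and its reconstruction WH."""
    return float(np.sqrt(np.mean((V - W @ H) ** 2)))

# Toy data (made up): 30 genes, 9 samples, three planted gene-sample blocks.
rng = np.random.default_rng(1)
V = 0.05 * rng.random((30, 9))
for b in range(3):
    V[10 * b:10 * (b + 1), 3 * b:3 * (b + 1)] += 1.0

errors = {k: rmse(V, *nmf(V, k)) for k in (1, 2, 3, 4)}
for k, e in errors.items():
    print(k, round(e, 4))
```

A rank is then typically chosen where the error curve flattens. For a cost function other than the Euclidean distance, `rmse` would need to be replaced by the matching error measure, as the paragraph above notes.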

Implementation of the NMF Algorithm.

The implementation of the steps in the model selection procedure outlined above is computationally very intensive for any real large-scale biological dataset. However, the stochastic nature of the algorithm enables each of these steps to be run independently and simultaneously. These steps can be repeated for multiple random initial conditions for W and H, and the information from the independent runs combined via consensus clustering. Therefore, the NMF algorithm lends itself easily to a parallel implementation that would greatly increase speed and efficiency. Devarajan and Wang [71] outlined such a parallel implementation of this algorithm on a Message-Passing Interface/C++ platform (http://www-unix.mcs.anl.gov/mpi/mpich2/) utilizing high-performance computing clusters.

Recently, there have also been other efforts to optimize the implementation of this algorithm. Okun and Priisalu [54],[72] have reported faster convergence of the algorithm when feature scaling is applied to the original p×n data matrix V, i.e., each of the p rows in V is normalized to have values between 0 and 1. Their results indicate an increase of at least 11 times in the speed of convergence of the iterations due to such normalization, depending on the number of hidden factors k used in the factorization.
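One simple way to scale each row of V to the interval from 0 to 1 is min-max scaling, sketched below. Whether this matches the exact scaling used in [54],[72] is an assumption; the function name `scale_rows` and the handling of constant rows are choices made here for illustration.

```python
import numpy as np

def scale_rows(V, eps=1e-12):
    """Min-max scale each row (gene) of V to the interval [0, 1].

    Rows with no variation are mapped to zeros (an illustrative choice).
    """
    V = np.asarray(V, dtype=float)
    lo = V.min(axis=1, keepdims=True)
    hi = V.max(axis=1, keepdims=True)
    return (V - lo) / np.maximum(hi - lo, eps)

V = np.array([[1.0, 2.0, 3.0],
              [10.0, 10.0, 10.0],
              [0.0, 5.0, 10.0]])
print(scale_rows(V))
```

The scaled matrix remains nonnegative, so it can be passed directly to an NMF routine.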

Identifying Hierarchical Structure.

It is also possible to have overlapping metagenes, i.e., genes with non-zero coefficients can appear in multiple metagenes, indicating the role of a single gene or a group of genes in multiple pathways or processes. The stochastic nature of the algorithm can also be exploited for identifying the involvement of such a group of genes. This is in contrast to most standard methods for clustering large-scale biological data, with a few exceptions [51]. These standard methods provide only a single solution determined by the dominant or overall structure in the data, where genes and samples are assigned to just one cluster, thus limiting the possibility of identifying overlapping expression patterns [38].

One of the attractive features of NMF is that, unlike HC, it does not force a hierarchy onto the data structure but identifies one when it is present. By specifying the desired rank of the factorization, one can uncover structures in the data in an ordered, sequential manner. Brunet et al. [38] and Devarajan [37] have demonstrated the ability of NMF to identify hierarchical and nested sub-structures using cancer microarray data. Brunet et al. [38] noted that NMF has higher resolution than HC and is more stable than SOM, as well as being more robust and less sensitive to a priori selection of genes. They also showed that NMF always converges towards a fixed attractor irrespective of random initial conditions, in comparison with a similar stochastic method like SOM.

For instance, in applying HC to the leukemia data to cluster the tissue samples, they note that the tree structure produced by HC depended very much on the choice of the linkage metric used in constructing the dendrogram. Furthermore, they observed that the performance of HC varied depending on the number of input genes used in the clustering. Similarly, the authors applied SOM to this data and observed that for k = 2, the clustering was unstable and depended upon the random initial conditions, while for k = 3, the method was unable to recover the three tumor types (for details see [38]). In sharp contrast, NMF with rank k = 2 was able to consistently recover the distinction between the ALL and AML types. This is reflected in the homogeneity of coloring in the heat map of the re-ordered consensus matrix shown in Figure 4 and the high cophenetic correlation coefficient in this case (see Figure 3). Similarly, a rank k = 3 factorization was able to consistently recover the distinction between the ALL-B and -T subtypes, as seen in Figure 5.

Limitations

A review of this widely applicable method would not be complete without a discussion of its limitations. As noted earlier, NMF is an algorithmically more complex method to implement, and convergence can be slow. This is further compounded by the stochastic nature of the algorithm, despite its obvious advantages as outlined in the previous section. The standard NMF formulation does not incorporate statistical dependencies between the metagenes or metagene expression patterns, nor does it identify any structural relationships between them. Also, the parts-based representation may be holistic, rather than local, depending on the type and nature of the data being studied [19],[23]–[25]. The nonnegativity constraints that are critical to such a representation may not be sufficient to achieve sparseness in some situations. Then, one would have to explicitly enforce sparseness by the addition of appropriate penalty terms to the cost function being used in the decomposition. In such cases, it is also possible that a parts-based, local representation may require fully hierarchical models with multiple levels of hidden variables rather than the single level used in this approach [1]. The issue of normalization of the observed data prior to NMF analysis is an important problem and one that has not been systematically studied. Some normalization methods have been suggested in the literature [50],[54],[72], but it would be useful to assess and compare the impact of different methods on the decomposition itself.

Discussion

In the NMF formulation, both the metagenes and the metagene expression patterns are nonnegative and sparse, and this is a key requirement for a parts-based local representation. Sparseness has been demonstrated to capture context-dependent biological patterns based on only a small subset of genes or samples. The alternating nature of the algorithm as defined by the multiplicative update rules facilitates simultaneous inference and learning [1],[2],[37] of the metagenes and metagene expression patterns. The stochastic nature of the NMF algorithm provides a means to evaluate its sensitivity to random initial conditions as well as to assess whether a given rank k provides a biologically meaningful decomposition of the data. Moreover, this feature has been successfully employed in identifying hierarchical structure within the data and in the implementation of parallel algorithms to increase speed and efficiency. Perhaps one of the most useful applications of NMF is in metagene projection, for cross-platform, cross-species analysis and interpretation of large-scale biological data. This technique not only reduces noise and technological variations in the data but can also incorporate prior knowledge in characterizing new datasets.

In the previous section, we noted that NMF does not account for dependencies in the metagenes or metagene expression patterns. However, in certain applications, it may be relevant to explicitly include or exclude dependencies in these hidden variables. For instance, independent component analysis (ICA) [73],[74] is an approach that produces statistically independent non-Gaussian components. There has been some work extending ICA to include nonnegativity constraints [75]–[77]. It would be potentially useful to extend this to include other dependency structures within these hidden variables.

In summary, NMF is an emerging new paradigm for large-scale biological data analysis and interpretation. It offers tremendous potential for application to a wide variety of computational biology problems, as evidenced by the recent surge in literature. The relevance of this approach for text mining and document clustering also makes it a potentially indispensable tool in biomedical informatics. Last but not least, its application is not limited to biological problems but encompasses diverse areas such as image and sound processing, text mining, and information retrieval.

Acknowledgments

The author would like to acknowledge the reviewers for providing valuable suggestions.

References

  1. Lee DD, Seung SH (1999) Learning the parts of objects by nonnegative matrix factorization. Nature 401: 788–791.
  2. Lee DD, Seung SH (2001) Algorithms for nonnegative matrix factorization. Adv Neural Inform Process Syst 13: 556–562.
  3. Paatero P, Tapper U (1994) Positive matrix factorization: A nonnegative factor model with optimal utilization of error estimates of data values. Environmetrics 5: 111–126.
  4. Paatero P (1997) Least-squares formulation of robust non-negative factor analysis. Chemometrics Intelligent Laboratory Sys 37: 23–35.
  5. Paatero P (1999) The Multilinear Engine: A table-driven least squares program for solving multilinear problems, including the n-way parallel factor analysis model. J Comput Graph Stat 8: 854–888.
  6. Juvela M, Lehtinen K, Paatero P (1994) The use of positive matrix factorization in the analysis of molecular line spectra from the Thumbprint Nebula. In: Clemens DP, Barvainis R, editors. Clouds, cores, and low mass stars. ASP Conference Series 65. pp. 176–180.
  7. Juvela M, Lehtinen K, Paatero P (1996) The use of positive matrix factorization in the analysis of molecular line spectra. Mon Not R Astron Soc 280: 616–626.
  8. Buciu I, Pitas I (2004) Application of non-negative and local non negative matrix factorization to facial expression recognition.
  9. Chen X, Gu L, Li S-Z, Zhang H-J (2001) Learning representative local features for face detection.
  10. Feng T, Li S-Z, Shum H-Y, Zhang H-Y (2001) Local nonnegative matrix factorization as a visual representation. Proceedings of the 2nd International Conference on Development and Learning. pp. 178–183. 2nd International Conference on Development and Learning; Cambridge, Massachusetts.
  11. Guillamet D, Vitrià J (2001) Discriminant basis for object classification. Proceedings of the 11th International Conference on Image Analysis and Processing. pp. 256–261. 11th International Conference on Image Analysis and Processing; 26–28 September 2001; Palermo, Italy.
  12. Guillamet D, Vitrià J (2003) Evaluation of distance metrics for recognition based on non-negative matrix factorization. Pattern Recognition Letters 24: 1599–1605.
  13. Guillamet D, Vitrià J, Schiele B (2003) Introducing a weighted non-negative matrix factorization for image classification. Pattern Recognition Letters 24: 2447–2454.
  14. Rajapakse M, Wyse L (2003) NMF vs ICA for face recognition.
  15. Ramanath R, Snyder WE, Qi H (2003) Eigenviews for object recognition in multispectral imaging systems. Proceedings of the 32nd Applied Imagery Pattern Recognition Workshop. pp. 33–38. 32nd Applied Imagery Pattern Recognition Workshop; 15–17 October 2003; Washington, D.C.
  16. Saul LK, Lee DD (2002) Multiplicative updates for classification by mixture models. In: Dietterich TG, Becker S, Ghahramani Z, editors. Advances in neural and information processing systems, volume 14. Cambridge, Massachusetts: MIT Press. pp. 897–904.
  17. Wang Y, Jia Y, Hu C, Turk M (2004) Fisher non-negative matrix factorization for learning local features. Proceedings of the 6th Asian Conference on Computer Vision. pp. 806–811. 6th Asian Conference on Computer Vision; 28–30 January 2004; Jeju Island, Korea.
  18. Lawrence J, Rusinkiewicz S, Ramamoorthi R (2004) Efficient BRDF importance sampling using a factored representation. ACM Trans Graph 23: 496–505.
  19. Li SZ, Hou X, Zhang H, Cheng Q (2001) Learning spatially localized, parts-based representation.
  20. Shahnaz F, Berry M (2006) Document clustering using nonnegative matrix factorization. Information Processing and Management: An International Journal 42: 373–386.
  21. Pauca P, Shahnaz F, Berry M, Plemmons R (2004) Text mining using nonnegative matrix factorizations.
  22. Chagoyen M, Carmona-Saez P, Shatkay H, Carazo JM, Pascual-Montano A (2006) Discovering semantic features in the literature: A foundation for building functional associations. BMC Bioinformatics 7: 41.
  23. Hoyer PO (2002) Nonnegative sparse coding. Neural Networks for Signal Processing XII: 557–565. IEEE Workshop on Neural Networks for Signal Processing; 4–6 March 2002; Martigny, Switzerland.
  24. Hoyer PO (2003) Modeling receptive fields with nonnegative sparse coding. Neurocomputing 52–54: 547–552.
  25. Hoyer PO (2004) Nonnegative matrix factorization with sparseness constraints. J Mach Learn Res 5: 1457–1469.
  26. Liu W, Zheng N, Lu X (2003) Non-negative matrix factorization for visual coding.
  27. Li Y, Cichocki A (2003) Sparse representation of images using alternating linear programming.
  28. Tsuge S, Shishibori M, Kuroiwa S, Kita K (2001) Dimensionality reduction using non-negative matrix factorization for information retrieval.
  29. Xu B, Lu J, Huang G (2003) A constrained non-negative matrix factorization in information retrieval.
  30. Behnke S (2003) Discovering hierarchical speech features using convolutional non-negative matrix factorization.
  31. Cho Y-C, Choi S, Bang S-Y (2003) Non-negative component parts of sound for classification. Proceedings of the 3rd IEEE International Symposium on Signal Processing and Information Technology. pp. 633–636. 3rd IEEE International Symposium on Signal Processing and Information Technology; 14–17 December 2003; Darmstadt, Germany.
  32. Novak M, Mammone R (2001) Use of non-negative matrix factorization for language model adaptation in a lecture transcription task.
  33. Smaragdis P, Brown JC (2003) Non-negative matrix factorization for polyphonic music transcription. Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. pp. 177–180. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics; 19–22 October 2001; New Paltz, New York.
  34. Cooper M, Foote J (2002) Summarizing video using nonnegative similarity matrix factorization. Proceedings of the IEEE Workshop on Multimedia Signal Processing. pp. 25–28. IEEE Workshop on Multimedia Signal Processing; 9–11 December 2002; St. Thomas, U.S. Virgin Islands.
  35. Lu J, Xu B, Yang H (2003) Matrix dimensionality reduction for mining Web logs. Proceedings of the IEEE/WIC International Conference on Web Intelligence. pp. 405–408. IEEE/WIC International Conference on Web Intelligence; 13 October 2003; Nova Scotia, Canada.
  36. Mao Y, Saul LK (2004) Modeling distances in large-scale networks by matrix factorization. Proceedings of the ACM Internet Measurement Conference. pp. 278–287. ACM Internet Measurement Conference; 25–27 October 2004; Sicily, Italy.
  37. Devarajan K (2006) Nonnegative matrix factorization: A new paradigm for large-scale biological data analysis.
  38. Brunet J-P, Tamayo P, Golub T, Mesirov J (2004) Metagenes and molecular pattern discovery using nonnegative matrix factorization. Proc Natl Acad Sci U S A 101: 4164–4169.
  39. Devarajan K, Ebrahimi N (2005) Molecular pattern discovery using non-negative matrix factorization based on Renyi's information measure.
  40. Kim P, Tidor B (2003) Subsystem identification through dimensionality reduction of large-scale gene expression data. Genome Res 13: 1706–1718.
  41. Heger A, Holm L (2003) Sensitive pattern discovery with ‘fuzzy’ alignments of distantly related proteins. Bioinformatics 19: i130–i137.
  42. Monti S, Tamayo P, Golub T, Mesirov JP (2003) Consensus clustering: A resampling-based method for class discovery and visualization in gene expression microarray data. Mach Learn J 52: 91–118.
  43. Gao Y, Church G (2005) Improving molecular cancer class discovery through sparse non-negative matrix factorization. Bioinformatics 21: 3970–3975.
  44. Kim H, Park H (2006) Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares. Proceedings of the IASTED International Conference on Computational and Systems Biology. pp. 95–100. IASTED International Conference on Computational and Systems Biology; 13–14 November 2005; Dallas, Texas.
  45. Carrasco DR, Tonon G, Huang Y, Zhang Y, Sinha R, et al. (2006) High-resolution genomic profiles define distinct clinico-pathogenic subgroups of multiple myeloma patients. Cancer Cell 9: 313–325.
  46. Devarajan K, Ebrahimi N (2008) Class discovery via nonnegative matrix factorization. American Journal of Management and Mathematical Sciences; In press.
  47. Pascual-Montano A, Carazo JM, Kochi K, Lehmann D, Pascual-Marqui R (2006) Nonsmooth nonnegative matrix factorization. IEEE Trans Pattern Anal Mach Intell 28: 403–415.
  48. Pascual-Montano A, Carazo JM, Kochi K, Lehmann D, Pascual-Marqui R (2005) Two-way clustering of gene expression profiles by sparse matrix factorization.
  49. Carmona-Saez P, Pascual-Marqui RD, Tirado F, Carazo JM, Pascual-Montano A (2006) Biclustering of gene expression data by non-smooth non-negative matrix factorization. BMC Bioinformatics 7: 78.
  50. Pascual-Montano A, Carmona-Saez P, Chagoyen M, Tirado F, Carazo JM, et al. (2006) bioNMF: A versatile tool for non-negative matrix factorization in biology. BMC Bioinformatics 28: 366.
  51. Wang G, Kossenkov AV, Ochs MF (2005) LS-NMF: A modified non-negative matrix factorization algorithm utilizing uncertainty estimates. BMC Bioinformatics 7: 175.
  52. Costanzo MC, Crawford ME, Hirschman JE, Kranz JE, Olsen P, et al. (2001) YPD, PombePD and WormPD: Model organism volumes of the BioKnowledge library, an integrated resource for protein information. Nucleic Acids Res 29: 75–79.
  53. Fogel P, Young SS, Hawkins DM, Ledirac N (2007) Inferential, robust non-negative matrix factorization analysis of microarray data. Bioinformatics 23: 44–49. [November 8, 2006, Epub ahead of print].
  54. Okun O, Priisalu H (2006) Fast nonnegative matrix factorization and its application for protein fold recognition. EURASIP J Appl Signal Processing Article ID 71817.
  55. Kelm BM, Menze BH, Zechmann CM, Baudendistel KT, Hamprecht FA (2007) Automated estimation of tumor probability in prostate magnetic resonance spectroscopic imaging: pattern recognition vs. quantification. Magn Reson Med 57: 150–159.
  56. Jung I, Lee J, Kim H, Lee S-Y, Kim D (2006) Improving profile-profile alignment features for fold-recognition using nonnegative matrix factorization.
  57. Tamayo P, Scanfeld D, Ebert BL, Gillette MA, Roberts CWM, et al. (2007) Metagene projection for cross-platform, cross-species characterization of global transcriptional states. Proc Natl Acad Sci U S A 104: 5959–5964.
  58. Isakoff MS, Sansam CG, Tamayo P, Subramanian A, Evans JA, et al. (2005) Loss of the Snf5 tumor suppressor stimulates cell cycle progression and cooperates with p53 loss in oncogenic transformation. Proc Natl Acad Sci U S A 102: 17745–17750.
  59. Pehkonen P, Wong G, Toronen P (2005) Theme discovery from gene lists for identification and viewing of multiple functional groups. BMC Bioinformatics 6: 162.
  60. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, et al. (2000) Gene Ontology: Tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25: 25–29.
  61. Tresch MC, Cheung VC, d'Avella A (2006) Matrix factorization algorithms for the identification of muscle synergies: Evaluation on simulated and experimental datasets. J Neurophysiol 95: 2199–2212.
  62. Kim SP, Rao YN, Erdogmus D, Sanchez JC, Nicolelis MAL, et al. (2005) Determining patterns in neural activity for reaching movements using nonnegative matrix factorization. EURASIP J Appl Signal Process 19: 3113–3121.
  63. Hiisilä H, Bingham E (2004) Dependencies between transcription factor binding sites: Comparison between ICA, NMF, PLSA and frequent sets. Proceedings of the Fourth IEEE International Conference on Data Mining. pp. 114–121. Fourth IEEE International Conference on Data Mining; 1–4 November 2004; Brighton, United Kingdom.
  64. Buchsbaum G, Bloch O (2002) Color categories revealed by non-negative matrix factorization of Munsell color spectra. Vision Res 42: 559–563.
  65. Nandigam R, Chuaqui C, Singh J, Kim S (2006) W-Sift, a structural interactions based potency-wise screening technique for protein-small molecule complexes.
  66. Chuaqui C, Nandigam R, Singh J, Deng Z (2006) A position specific interaction-based scoring technique for virtual screening.
  67. Sajda P, Du S, Brown TR, Stoyanova R, Shungu DC, et al. (2004) Nonnegative matrix factorization for rapid recovery of constituent spectra in magnetic resonance chemical shift imaging of the brain. IEEE Trans Med Imaging 23: 1453–1465.
  68. Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. New York: Springer-Verlag.
  69. Ross DA, Zemel RS (2006) Learning parts-based representations of data. J Mach Learn Res 7: 2369–2397.
  70. Donoho D, Stodden V (2003) When does nonnegative matrix factorization give a correct decomposition into parts? Adv Neural Inform Process Syst 16. Cambridge (Massachusetts): MIT Press.
  71. Devarajan K, Wang G (2007) Parallel implementation of non-negative matrix algorithms using high-performance computing cluster.
  72. Okun O, Priisalu H (2005) Nonnegative matrix factorization for pattern recognition. Proceedings of the 5th IASTED International Conference on Visualization, Imaging and Image Processing. pp. 546–551. 5th IASTED International Conference on Visualization, Imaging and Image Processing; 7–9 September 2005; Benidorm, Spain.
  73. Bartlett MS, Lades HM, Sejnowski TJ (1998) Independent component representations for face recognition. Proc SPIE 3299: 528–539.
  74. Hyvarinen A, Karhunen J, Oja E (2001) Independent Component Analysis. New York: Wiley Interscience.
  75. Plumbley M, Oja E (2004) A “nonnegative PCA” algorithm for independent component analysis. IEEE Trans Neural Netw 15: 66–76.
  76. Oja E, Plumbley M (2004) Blind separation of positive sources by globally convergent gradient search. Neural Comput 16: 1811–1825.
  77. Plumbley M (2003) Algorithms for nonnegative independent component analysis. IEEE Trans Neural Netw 14: 534–543.