next up previous
Next: Definition of problem Up: Least-Squares Fitting of a Previous: Least-Squares Fitting of a

Introduction

A recurring computational problem in isotopic studies of terrestrial and extraterrestrial materials is the interpretation of observed mass spectra in terms of mixtures of various source components. Each source component is characterized by fixed, though often unknown, isotopic ratios, but is present in variable amounts in different measured samples. One would like to verify the hypothesis that a given number of components is adequate to account for all observations and, if possible, not only to determine the compositions of the source components but also to resolve each measured sample into its original components, so that the different processes involved in its origin can be studied separately. This paper treats only the first steps in this sequence of analysis, i.e., the investigation of the number and compositions of possible source components.

The measured mass spectra may be denoted by $Y_i(\mu)$, where $i=1, \ldots, n$ is the sample number and $\mu$ is the mass number. Since $\mu$ takes on only discrete values $\mu_k$, $k=1, \ldots, p$, each mass spectrum can be represented by a vector in a $p$-dimensional vector space, $Y_{ik} = Y_i(\mu_k)$. These are assumed to be made up of linear combinations of the component spectra,

\begin{displaymath}
Y_i(\mu) = \sum_{j=1}^m \alpha_{ij}g_j(\mu)
\end{displaymath} (1)

where the $\alpha_{ij}$ are scalars between 0 and 1 subject to the normalization condition $\sum_{j=1}^m \alpha_{ij} = 1$, $i=1, \ldots, n$, and the $g_j(\mu)$ are the $m$ different component spectra.
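
As a minimal numerical sketch of Equation (1), mixtures can be generated from a set of component spectra as shown here. The sketch assumes NumPy; the component values in `g`, the function name `mix_spectra`, and the Dirichlet draw of the mixing fractions are illustrative assumptions and do not come from measured data.

```python
import numpy as np

def mix_spectra(alpha, g):
    """Return Y[i, k] = sum_j alpha[i, j] * g[j, k], i.e. Equation (1)."""
    alpha = np.asarray(alpha, dtype=float)
    g = np.asarray(g, dtype=float)
    if np.any(alpha < 0) or np.any(alpha > 1):
        raise ValueError("mixing fractions must lie in [0, 1]")
    if not np.allclose(alpha.sum(axis=1), 1.0):
        raise ValueError("each row of alpha must sum to 1")
    return alpha @ g

# Hypothetical example: m = 2 components, p = 3 mass numbers, n = 4 samples.
g = np.array([[1.0, 0.2, 0.05],
              [1.0, 0.8, 0.30]])
rng = np.random.default_rng(0)
alpha = rng.dirichlet(np.ones(2), size=4)  # rows are non-negative and sum to 1
Y = mix_spectra(alpha, g)                  # shape (n, p)
```

Each row of `Y` is one simulated sample spectrum $Y_{ik}$; because the rows of `alpha` sum to 1, every sample is a convex combination of the rows of `g`.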

The problem is analogous to that of curve resolution encountered, for example, in chromatography or spectrophotometry, where it has been treated with considerable success using the technique of principal component analysis. Lawton and Sylvestre (1971), for example, considered the case of two source components and developed a method for computing two bands of curves, each containing one of the source components. The method of principal component analysis, however, runs into difficulties if the data are characterized by widely different experimental uncertainties. This is often the case with mass spectroscopic data: even if the relative uncertainties in the isotopic ratios are similar, the ratios themselves can vary by orders of magnitude, and so do their absolute uncertainties. As Anderson (1963) pointed out, the method of principal component analysis is justified only if the ratio of the ``uncertainty'' variance to the ``systematic,'' i.e., correlation, variance is the same for all components of the data. Nonconformity with this requirement may be remedied to some degree by rescaling the data according to their respective uncertainties. Here we abandon the method of principal component analysis in favor of an alternative approach that is on a better statistical footing, in that it takes full account of the estimated uncertainties of the data.
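
The rescaling remedy can be sketched as follows. This illustrates only the preprocessing step for principal component analysis, not the least-squares method developed in this paper. It assumes NumPy and, as a simplification, a single known standard deviation per coordinate; the names `scaled_principal_axes` and `sigma` are hypothetical.

```python
import numpy as np

def scaled_principal_axes(Y, sigma):
    """Principal axes of data after dividing each coordinate by its uncertainty.

    Y     : (n, p) array of measurements
    sigma : (p,) array of assumed standard deviations, one per coordinate
    Returns the singular values and the principal axes (rows of vt),
    both expressed in the rescaled coordinates.
    """
    Z = np.asarray(Y, dtype=float) / np.asarray(sigma, dtype=float)
    Zc = Z - Z.mean(axis=0)  # center the rescaled data
    _, s, vt = np.linalg.svd(Zc, full_matrices=False)
    return s, vt
```

When the uncertainties differ from sample to sample as well as from coordinate to coordinate, no single rescaling of this kind equalizes them, which is one reason to prefer a fit that weights each datum individually.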

It is easily shown that data points consisting of linear combinations of components according to Equation (1) must lie in an $(m-1)$-dimensional (affine) subspace of the full $p$-dimensional vector space. This subspace is determined by the simplex whose vertices are the $m$ distinct components, within which the data points lie. This paper deals with only the first step in component resolution, namely the determination of the parameters of this subspace. Furthermore, it considers only the simplest case, in which $p = m$, that is, the number of components is the same as the number of coordinates of the space (e.g., the number of isotopic ratios measured in each sample). Thus for 2-dimensional data we seek the equation of a straight line, for 3-dimensional data a plane, and in general a hyperplane of dimension one less than the space in which it is embedded. The general case of arbitrary $m \le p$ is to be dealt with in a future paper.
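
The dimension count follows from the normalization condition: since $\sum_j \alpha_{ij} = 1$, each data point satisfies $Y_i - g_1 = \sum_{j=2}^m \alpha_{ij}(g_j - g_1)$, a combination of only $m-1$ vectors. The following sketch checks this numerically for $p = m = 3$. It assumes NumPy and made-up component vectors in general position; the function name `hyperplane_through` is hypothetical, and the sketch treats error-free data, so it is not the fitting procedure of this paper.

```python
import numpy as np

def hyperplane_through(g):
    """Unit normal a and offset c with a @ g[j] == c for every vertex g[j].

    g : (m, p) array of component vertices with p == m, assumed to be in
        general position (the m - 1 edge vectors are linearly independent).
    """
    g = np.asarray(g, dtype=float)
    edges = g[1:] - g[0]             # (m - 1, p) vectors spanning the hyperplane
    _, _, vt = np.linalg.svd(edges)  # full SVD, so vt has shape (p, p)
    a = vt[-1]                       # unit vector orthogonal to every edge
    return a, a @ g[0]

# Hypothetical components for p = m = 3.
g = np.array([[1.0, 0.2, 0.05],
              [0.5, 0.8, 0.30],
              [0.1, 0.4, 0.90]])
a, c = hyperplane_through(g)

rng = np.random.default_rng(1)
alpha = rng.dirichlet(np.ones(3), size=50)  # rows are non-negative and sum to 1
Y = alpha @ g                               # mixtures according to Equation (1)
residuals = Y @ a - c                       # vanish up to rounding error
```

Every mixture satisfies $a \cdot Y_i = c$ because $a \cdot Y_i = \sum_j \alpha_{ij}\, a \cdot g_j = c \sum_j \alpha_{ij} = c$; with real data the residuals are nonzero, and finding the hyperplane becomes a fitting problem.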


Robert Moniot 2002-10-20