Share Your Geology Here

All things about Geology

VARIOGRAM

Overview
Variogram modeling was added to Surfer as an integrated data analysis tool. The primary purpose of the variogram modeling subsystem is to assist you in selecting an appropriate variogram model when gridding with the kriging algorithm. Variogram modeling may also be used to quantitatively assess the spatial continuity of data even when the kriging algorithm is not applied.

Variogram modeling is not an easy or straightforward task. Developing an appropriate variogram model for a data set requires understanding and applying advanced statistical concepts and tools: this is the science of variogram modeling. It also requires knowledge of the tricks, traps, pitfalls, and approximations inherent in fitting a theoretical model to real-world data: this is the art of variogram modeling. Skill with both the science and the art is necessary for success.
Developing an appropriate variogram model requires numerous correct decisions. These decisions can only be properly addressed with an intimate knowledge of the data at hand and a competent understanding of the data genesis (i.e., the underlying processes from which the data are drawn). The cardinal rule when modeling variograms is: know your data.
The variogram is a measure of how quickly things change on average. The underlying principle is that, on average, two observations closer together are more similar than two observations farther apart. Because the underlying processes often have preferred orientations, values may change more quickly in one direction than in another. As such, the variogram is a function of direction.
The variogram is a three-dimensional function. There are two independent variables (the direction θ and the separation distance h) and one dependent variable (the variogram value γ(θ,h)). When the variogram is specified for kriging, we give the sill, range, and nugget, but we also specify the anisotropy information. The variogram grid is the way this information is organized inside the program. The variogram (XY plot) is a radial slice (like a piece of pie) from the variogram grid, which can be thought of as a "funnel shaped" surface. Slicing is necessary because it is difficult to draw the three-dimensional surface, let alone fit a three-dimensional function (model) to it. By taking slices, it is possible to draw and work with the directional experimental variogram in a familiar form: an XY plot. Remember that each directional experimental variogram is associated with a single direction, whereas the ultimate variogram model must be applicable to all directions. When fitting the model, the user starts with numerous slices, but must ultimately mentally integrate the slices into a final 3D model.
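For reference, the quantity plotted in a directional experimental variogram is usually estimated as half the average squared difference between paired observations. The form below is the standard textbook estimator, not a formula quoted from the Surfer documentation. The symbols are notation assumed here: θ is the direction, h is the separation distance, P(θ,h) is the set of pairs whose separation falls near direction θ and distance h, N(θ,h) is the number of such pairs, and z_i is the observed value at location i.

```latex
\gamma(\theta, h) = \frac{1}{2\,N(\theta, h)} \sum_{(i,j) \in P(\theta, h)} \left( z_i - z_j \right)^2
```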

Kriging and Variograms

The kriging algorithm incorporates four essential details:
1. When computing the interpolation weights, the algorithm considers the spacing between the point to be interpolated and the data locations. The algorithm considers the inter-data spacings as well. This allows for declustering.
2. When computing the interpolation weights, the algorithm considers the inherent length scale of the data. For example, the topography in Kansas varies much more slowly in space than does the topography in central Colorado. Consider two observed elevations separated by five miles. In Kansas it would be reasonable to assume a linear variation between these two observations, while in the Colorado Rockies such an assumed linear variation would be unrealistic. The algorithm adjusts the interpolation weights accordingly.
3. When computing the interpolation weights, the algorithm considers the inherent trustworthiness of the data. If the data measurements are exceedingly precise and accurate, the interpolated surface goes through each and every observed value. If the data measurements are suspect, the interpolated surface may not go through an observed value, especially if a particular value is in stark disagreement with neighboring observed values. This is an issue of data repeatability.
4. Natural phenomena are created by physical processes. Often these physical processes have preferred orientations. For example, at the mouth of a river the coarse material settles out fastest, while the finer material takes longer to settle. Thus, the closer one is to the shoreline the coarser the sediments, while the further from the shoreline the finer the sediments. When computing the interpolation weights, the algorithm incorporates this natural anisotropy. When interpolating at a point, an observation 100 meters away but in a direction parallel to the shoreline is more likely to be similar to the value at the interpolation point than is an equidistant observation in a direction perpendicular to the shoreline.
Items two, three, and four all incorporate something about the underlying process from which the observations were taken. The length scale, data repeatability, and anisotropy are not functions of the data locations. They enter the kriging algorithm via the variogram: the length scale is given by the variogram range (or slope), the data repeatability is specified by the nugget effect, and the anisotropy is given by the anisotropy parameters of the variogram model.
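The declustering described in item one can be illustrated with a small ordinary kriging system. This is a minimal sketch, not Surfer's implementation: it assumes a linear variogram with unit slope and no nugget, and the function names (`linear_variogram`, `solve`, `ordinary_kriging_weights`) and the point layout are hypothetical.

```python
import math

def linear_variogram(h, slope=1.0):
    """Assumed model for illustration: gamma(h) = slope * h, no nugget."""
    return slope * h

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging_weights(points, target, gamma=linear_variogram):
    """Return one interpolation weight per data point (weights sum to 1)."""
    n = len(points)
    # Left-hand side: variogram between data points, plus the unbiasedness row.
    a = [[gamma(math.dist(p, q)) for q in points] + [1.0] for p in points]
    a.append([1.0] * n + [0.0])
    # Right-hand side: variogram between each data point and the target.
    b = [gamma(math.dist(p, target)) for p in points] + [1.0]
    return solve(a, b)[:n]  # drop the Lagrange multiplier

# One isolated point on the left, two clustered points on the right.
points = [(-1.0, 0.0), (1.0, 0.0), (1.0, 0.1)]
weights = ordinary_kriging_weights(points, (0.0, 0.0))
print([round(w, 3) for w in weights])
```

All three points are almost the same distance from the target, yet the isolated point receives more weight than either clustered point, because the inter-data spacings enter the system through the left-hand matrix.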

The Variogram Grid

Users familiar with GeoEAS or VarioWin® should be familiar with pair comparison files [.PCF]. Surfer uses a variogram grid as a fundamental internal data representation, in lieu of a pair comparison file. The pair comparison file can be extremely large for moderately sized data sets. For example, N = 5000 observations create N(N-1)/2 = 12,497,500 pairs. Each pair requires 16 bytes of information in a pair comparison file, so a 5000-observation pair comparison file would take approximately 191 megabytes of memory merely to hold the pair comparison information. The time to read and search through this large file makes this approach impractical for many Surfer users.
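The arithmetic behind these figures can be checked with a short sketch. The function name `pair_comparison_size` is hypothetical, and the conversion assumes that a megabyte means 2^20 bytes, which is what reproduces the figure of roughly 191.

```python
def pair_comparison_size(n_observations, bytes_per_pair=16):
    """Return (number of pairs, bytes needed) for a pair comparison file."""
    n_pairs = n_observations * (n_observations - 1) // 2
    return n_pairs, n_pairs * bytes_per_pair

n_pairs, n_bytes = pair_comparison_size(5000)
print(n_pairs)                     # 12497500
print(round(n_bytes / 2**20, 1))   # 190.7 (megabytes of 2**20 bytes)
```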
Computational speed and storage are gained by using the variogram grid approach. Once the variogram grid is built, any experimental variogram can be computed instantaneously. This is independent of the number of observations. However, the ability to carry out on-the-fly editing of variograms on a pair-by-pair basis is lost by using the variogram grid approach in Surfer.
Unlike the grids used elsewhere in Surfer, which are rectangular grids, variogram grids are polar grids. Polar grids cannot be viewed in Surfer, and are only used within the context of variogram computation. The first coordinate in a variogram grid is associated with the polar angle, and the second coordinate is associated with the radial distance out from the origin.
Consider an example variogram grid with eight angular divisions: {0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°} and four radial divisions: {100, 200, 300, 400}. There are therefore 32 individual cells in this variogram grid. Users familiar with VarioWin® will notice similarities between Surfer's variogram grid and the "variogram surface" in VarioWin® 2.2. In Surfer, only the upper half of the grid is used. See the General Page for a more detailed explanation.
Consider the following three observation locations: {(50,50), (100, 200), and (500,100)}. There are three observations, so there are 3*(3-1)/2 = 3 pairs. The pairs are:
A (50,50), (100,200)
B (50,50), (500,100)
C (100,200), (500,100)
Each pair is placed in a particular cell of the variogram grid based upon the separation distance and separation angle between the two observation locations.
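For a pair of locations (x_1, y_1) and (x_2, y_2), the separation distance h and separation angle θ can be written as follows. This is a reconstruction assumed from the tabulated values for pairs A, B, and C, with each pair ordered so that x_2 ≥ x_1; it is not quoted from the Surfer documentation.

```latex
h = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}, \qquad
\theta = \arctan\!\left( \frac{y_2 - y_1}{x_2 - x_1} \right)
```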
Using the above equations, the separation angle for the first pair of observations {(50,50), (100,200)} is 71.57 degrees and the separation distance is 158.11. This pair is placed in the cell bounded by the 100 circle on the inside, the 200 circle on the outside, the 45° line in the clockwise direction, and the 90° line in the counterclockwise direction. The location of this pair in the variogram grid is shown on the previous page as point A.
Pair   Separation Angle (degrees)   Separation Distance
A       71.57                       158.11
B        6.34                       452.77
C      -14.04                       412.31
The separation angle and separation distance for each pair
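The tabulated values can be reproduced with a few lines of code. This is a sketch under the assumption that the angle is measured counterclockwise from the positive X axis, from the first location of the pair to the second; the helper name `separation` is hypothetical.

```python
import math

def separation(p, q):
    """Return (angle in degrees, distance) for the pair p -> q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return math.degrees(math.atan2(dy, dx)), math.hypot(dx, dy)

pairs = {
    "A": ((50, 50), (100, 200)),
    "B": ((50, 50), (500, 100)),
    "C": ((100, 200), (500, 100)),
}

for name, (p, q) in pairs.items():
    angle, dist = separation(p, q)
    print(f"{name} {angle:7.2f} {dist:7.2f}")
```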
Since the separation distances of pairs B and C are greater than the radius of the largest circle (400), these pairs fall outside of the variogram grid. Pairs B and C are not included in the variogram grid and are therefore not included in the variogram. Using the above equations, every pair is placed into one of the variogram grid cells, or it is discarded if the separation distance is too large.
For a large data set there could be millions of pairs (or more) and the associated pair comparison file would be very large. On the other hand, with the variogram grid in the example above there are only 32 grid cells regardless of the number of pairs contained in a particular grid cell. Herein lies the computational saving of the variogram grid approach. It is not necessary that every pair is stored in a variogram grid cell; each variogram grid cell stores only a small set of summary statistics which represent all of the pairs contained within that cell.
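The idea of storing summary statistics per cell rather than individual pairs can be sketched as follows. The text does not say which statistics Surfer keeps, so this example assumes a pair count and a sum of squared differences, which is enough to compute a semivariance per cell. It also assumes that angles are folded into the upper half of the grid (0° to 180°), giving 16 cells instead of 32. The function names and the z values are hypothetical.

```python
import math
from itertools import combinations

ANGLE_STEP = 45.0    # degrees per angular division in the example grid
RADIAL_STEP = 100.0  # distance per radial division in the example grid
N_RADIAL = 4         # four rings, so the grid reaches out to 400
N_ANGULAR = 4        # upper half only: 0-45, 45-90, 90-135, 135-180

def cell_index(p, q):
    """Return (angular index, radial index) for a pair, or None if discarded."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    dist = math.hypot(dx, dy)
    if dist > N_RADIAL * RADIAL_STEP:
        return None  # beyond the largest circle
    # Assumed folding: a pair and its reverse share one upper-half cell.
    angle = math.degrees(math.atan2(dy, dx)) % 180.0
    sector = min(int(angle // ANGLE_STEP), N_ANGULAR - 1)
    ring = min(int(dist // RADIAL_STEP), N_RADIAL - 1)
    return sector, ring

def build_variogram_grid(observations):
    """Accumulate [pair count, sum of squared z differences] per cell."""
    grid = {}
    for (x1, y1, z1), (x2, y2, z2) in combinations(observations, 2):
        cell = cell_index((x1, y1), (x2, y2))
        if cell is None:
            continue
        stats = grid.setdefault(cell, [0, 0.0])
        stats[0] += 1
        stats[1] += (z1 - z2) ** 2
    return grid

# z values are hypothetical; the locations are the three from the example.
observations = [(50, 50, 1.0), (100, 200, 3.0), (500, 100, 7.0)]
grid = build_variogram_grid(observations)
for cell, (count, sum_sq) in grid.items():
    print(cell, count, sum_sq / (2 * count))  # semivariance of the cell
```

Only pair A lands in the grid, in the cell between 45° and 90° and between the 100 and 200 circles. The memory used depends on the number of cells, not on the number of pairs.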

Variogram Model
The variogram model mathematically specifies the spatial variability of the data set and the resulting grid file. The interpolation weights, which are applied to data points during the grid node calculations, are direct functions of the variogram model.
NUGGET EFFECT: quantifies the sampling and assaying errors and the short scale variability (i.e., spatial variation occurring at distances smaller than the sample spacing).
SCALE (C): is the vertical scale for the structured component of the variogram. Each component of a variogram model has its own scale.
SILL: is the total vertical scale of the variogram (Nugget Effect + Sum of all component Scales). Linear, Logarithmic, and Power variogram models do not have a sill.
LENGTH: is the horizontal range of the variogram. (Some variogram models do not have a length parameter; e.g., the linear model has a slope instead.)
VARIANCE: is the mean squared deviation of each value from the mean value. Variance is indicated by the dashed horizontal line in the diagram shown above.
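How these parameters combine can be sketched with two single-component models. The spherical model is used here as an assumed example of a model that has a sill and a length; the text above does not name it. The linear model illustrates a model with a slope and no sill. The function names are hypothetical, and the formulas are the standard textbook forms rather than code taken from Surfer.

```python
def spherical_variogram(h, nugget, scale, length):
    """Spherical model: rises from the nugget and levels off at the sill."""
    if h <= 0:
        return 0.0  # by definition the variogram is zero at zero separation
    if h >= length:
        return nugget + scale  # the sill
    r = h / length
    return nugget + scale * (1.5 * r - 0.5 * r ** 3)

def linear_variogram(h, nugget, slope):
    """Linear model: grows without bound, so it has no sill."""
    if h <= 0:
        return 0.0
    return nugget + slope * h

def sill(nugget, scales):
    """Sill = nugget effect + sum of all component scales."""
    return nugget + sum(scales)

nugget, scale, length = 1.0, 4.0, 200.0
for h in (0.0, 100.0, 200.0, 400.0):
    print(h, spherical_variogram(h, nugget, scale, length))
print(sill(nugget, [scale]))
```

Beyond the length, the spherical model stays at the sill, whereas the linear model keeps increasing with separation distance.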