Tips for Getting Better Curve Fits
Band Selection
In general, the selection of bands should reflect assumptions about the
probable chemical state of the sample, and bands should not be chosen purely on
the basis of how well the synthetic curve happens to model the experimental
data. If noise or other artifacts influence the artificial peaks used to model
the data, it is best to prepare the data to remove these factors first, either
by transformations (satellite subtraction, smoothing) or by editing the data.
Also, minimizing the number of parameters used will result in a more
predictable fit. Where possible, it is best to use the least complex band type.
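For illustration, the following Python sketch (not this application's internal band definitions; the names and shapes are assumptions chosen for demonstration) contrasts a pure Gaussian band, which needs only three parameters, with a mixed Gaussian-Lorentzian band that adds a fourth. An asymmetric band would add tail parameters on top of that, so each step up in complexity gives the optimizer more freedom and makes the fit less predictable.

```python
# Illustrative band shapes (assumptions, not this application's definitions).
import numpy as np

def gaussian_band(x, height, position, fwhm):
    """Pure Gaussian band: 3 parameters."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return height * np.exp(-0.5 * ((x - position) / sigma) ** 2)

def gl_mixed_band(x, height, position, fwhm, mix):
    """Gaussian-Lorentzian sum band: 4 parameters (mix = Lorentzian fraction)."""
    gauss = gaussian_band(x, 1.0, position, fwhm)
    lorentz = 1.0 / (1.0 + ((x - position) / (fwhm / 2.0)) ** 2)
    return height * ((1.0 - mix) * gauss + mix * lorentz)
```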
Limits
While there are no hard and fast rules for choosing the default limits, keep
in mind that the choice of limits has a great effect on curve fit optimization.
If it is difficult to come up with initial estimates that provide a good
initial fit, it may be best to keep the limits loose at first and tighten them up
later.
Important: The limits for each parameter adjust with the estimate, both when the
estimate is adjusted manually and after the curve fit routine is performed. This
is why a second curve fit command may produce a different (and possibly
improved) fit. Since the limits track the estimates, they are effectively
“loosened” after each fit. As you get closer to the desired fit, you may wish to
tighten up the limits or adjust the invariance.
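As a rough illustration of this loose-then-tight approach, the sketch below uses SciPy's general-purpose fitter rather than this application's own routine; the band shape, eV range, and bound values are assumptions chosen only for demonstration.

```python
# Hedged sketch: fit with loose bounds first, then tighten them once the
# result is close to the desired fit. SciPy stands in for the application.
import numpy as np
from scipy.optimize import curve_fit

def gaussian_band(x, height, position, fwhm):
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return height * np.exp(-0.5 * ((x - position) / sigma) ** 2)

x = np.linspace(280.0, 295.0, 300)                       # hypothetical eV axis
y = gaussian_band(x, 1000.0, 285.0, 1.8) + np.random.normal(0.0, 10.0, x.size)

p0 = [900.0, 284.5, 2.0]                                  # initial estimates
loose = ([500.0, 283.0, 1.0], [1500.0, 287.0, 4.0])       # loose limits
popt, _ = curve_fit(gaussian_band, x, y, p0=p0, bounds=loose)

# Once the result looks reasonable, tighten the limits around it and re-fit.
tight = ([popt[0] * 0.9, popt[1] - 0.2, popt[2] * 0.9],
         [popt[0] * 1.1, popt[1] + 0.2, popt[2] * 1.1])
popt2, _ = curve_fit(gaussian_band, x, y, p0=popt, bounds=tight)
```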
Tightening Limits Using Invariance and Locking
Setting the invariance to a higher number is an easy way to tighten the limits
for a particular band for each fit. As stated previously, the limits track the estimate for each fit, but
the invariance stays the same. Setting the invariance higher for a selected
parameter causes that parameter to be adjusted later in the fit, which ensures
that it does not change as dramatically.
Locking a parameter for a fit is essentially the same as setting the limits
equal to the estimate for that parameter. The parameter does not adjust at all
during the fit. This can be especially useful for the position parameter, since
the position of a peak is often the most easily characterized parameter.
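The idea behind locking can be sketched as follows; this is illustrative only and is not how the application implements locking internally. The locked parameter (here a hypothetical position value) is simply never exposed to the optimizer, so it cannot move during the fit.

```python
# Sketch of "locking": hold position at its estimate and let only the
# remaining parameters vary. Illustrative only; names are assumptions.
import numpy as np
from scipy.optimize import curve_fit

def gaussian_band(x, height, position, fwhm):
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return height * np.exp(-0.5 * ((x - position) / sigma) ** 2)

locked_position = 285.0   # e.g. a well-characterized peak position in eV

def band_with_locked_position(x, height, fwhm):
    # position never appears as a free parameter, so it cannot change
    return gaussian_band(x, height, locked_position, fwhm)

# The fitter now optimizes only height and fwhm; position stays put:
# popt, _ = curve_fit(band_with_locked_position, x, y, p0=[900.0, 2.0])
```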
Re-fitting Data
It is often the case that the first time a curve fit is performed, the routine
converges before the desired fit is obtained. This is sometimes due to hitting
the limit on the maximum number of iterations, but it is more often due to tight
constraints. In this case, the usual remedy is to click the fit button again and
observe whether the goodness of fit has improved and whether the synthetic curve
more closely resembles the expected outcome.
Why do you get a better fit the second time, even though you have not readjusted
any parameters? This is simply because after the first fit, the limits adjust
themselves to track the estimates, in effect “loosening” them between fits. This
is often desirable, since the computer “nudges” the estimates towards
convergence. In some cases, however, you may want the limits for some parameters
to remain fixed for all subsequent fits on that data set. In that case, you must
be careful to readjust those limits between fits. Alternatively, you can lock
those parameters for some fits (or increase the invariance), which has the
effect of “tightening” the existing limits.
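The “limits track the estimates” behavior can be mimicked in the SciPy sketch from the Limits section above (again an assumption-laden illustration, not the application's algorithm): after each fit the bounds are re-centered on the new estimates, so repeated fits are free to keep improving. To keep limits truly fixed, you would reset the bounds to the same values before every pass instead.

```python
# Continues the SciPy sketch from the Limits section (gaussian_band, x, y
# and curve_fit are defined there). After each pass the bounds are
# re-centered on the latest estimates, mimicking limits that "track" them.
half_window = [100.0, 0.3, 0.5]      # hypothetical +/- window per parameter
estimates = [900.0, 284.5, 2.0]      # height, position (eV), FWHM (eV)

for _ in range(3):
    lower = [e - w for e, w in zip(estimates, half_window)]
    upper = [e + w for e, w in zip(estimates, half_window)]
    popt, _ = curve_fit(gaussian_band, x, y, p0=estimates,
                        bounds=(lower, upper))
    estimates = list(popt)           # the next pass's limits follow these
```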
Correcting Unsatisfactory Fits
The choice of initial estimates greatly determines the direction the curve fit
algorithm will take in adjusting parameters. In some cases, the least-squares
curve fit algorithm may be converging to a local minimum, instead of the
desired solution. One of the easiest ways to combat this is to choose new estimates
on the “opposite” side of the solution. For example, if the initial estimates produce a curve
that lies above the experimental data, adjust the estimates to produce a curve
that lies below the data, and then re-fit.
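As a sketch of this strategy (again using SciPy as a stand-in for the application's fitter, with made-up data and starting values), one can fit from a guess that starts above the data and from one that starts below it, then keep whichever result gives the smaller residual.

```python
# Hedged sketch: try initial estimates on opposite sides of the data and
# keep the fit with the smaller residual sum of squares.
import numpy as np
from scipy.optimize import curve_fit

def gaussian_band(x, height, position, fwhm):
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return height * np.exp(-0.5 * ((x - position) / sigma) ** 2)

x = np.linspace(280.0, 295.0, 300)
y = gaussian_band(x, 1000.0, 285.0, 1.8) + np.random.normal(0.0, 10.0, x.size)

best = None
for p0 in ([1400.0, 285.5, 2.5],     # starts above the data
           [600.0, 284.5, 1.2]):     # starts below the data
    popt, _ = curve_fit(gaussian_band, x, y, p0=p0)
    rss = np.sum((y - gaussian_band(x, *popt)) ** 2)
    if best is None or rss < best[1]:
        best = (popt, rss)
```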
There are times when re-fitting the data multiple times results in some
estimates drifting far outside the expected range. This is often the case with
asymmetric bands, where the tail scale and tail length can grow out of
proportion because of small peaks that have not been modeled by additional
bands. If this happens, you may want to readjust these estimates to expected
values and try to determine what aspect of the experimental data is not being
modeled properly by the existing bands.
Correcting Low-Intensity Fits
One problem that often occurs is that the artificial peaks appear lower in
intensity than the experimental data. There is a specific reason for this, and
there are ways to combat the problem.
The goodness-of-fit value (which the curve fit algorithm uses to judge the
success of a particular fit) is weighted by the intensity. However, the lower
intensity portions of a curve, the “wider” regions, also contribute to the
goodness-of-fit value, and this is desirable. Because of the Gaussian curve
shape, the lower intensity data covers a greater range of eV and therefore
contributes more to the area; the highest portion of the peak, while greater in
intensity, often contributes relatively little. Since all estimates except
position contribute to the area, the total area is an important factor in the
correctness of the fit and is obviously important in determining the chemical
state and the correctness of the results. The algorithm models this behavior.
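The application's exact goodness-of-fit formula is not reproduced here, but a small numerical sketch (with an assumed Gaussian band and an assumed 15 eV fitting window) shows why the low-intensity region carries so much weight: far more data points lie below half maximum than in the narrow region at the top of the peak, so their residuals dominate any sum-of-squares style statistic.

```python
# Hedged numerical sketch: count points in the narrow peak top versus the
# wide low-intensity wings over a hypothetical fitting window.
import numpy as np

def gaussian_band(x, height, position, fwhm):
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return height * np.exp(-0.5 * ((x - position) / sigma) ** 2)

x = np.linspace(280.0, 295.0, 300)          # hypothetical 15 eV window
y = gaussian_band(x, 1000.0, 285.0, 1.8)    # FWHM of 1.8 eV

above_half = np.sum(y >= 500.0)             # points in the narrow peak top
below_half = np.sum(y < 500.0)              # points in the wings and base
print(above_half, below_half)               # roughly 36 versus 264 points
```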
However, these results (lower intensity synthetic peaks) are often undesirable
from both an objective and a subjective viewpoint: when viewing a graph, they
are more obvious than errors at the lower portion of the peak. Keep in mind that
the problem is probably not with the peak itself but with the lower intensity
data, which is often associated with the background or with lower intensity
peaks around the peak base. In effect, the algorithm is trying to widen the
lower portion of the curve to correct for this, and it does so by “flattening”
the peaks, thereby widening their bottoms. All parameters except position
contribute to this effect.
To solve this problem, begin by studying the lower intensity experimental data.
Often, some of the background remains after baseline subtraction, and choosing a
new baseline removes this residual data. There may also be one or more lower
intensity peaks that have not been modeled, in which case additional peaks may
need to be added. Lastly, noise in the low intensity data often causes the
algorithm to flatten the bands; the data may need to be smoothed, or in some
cases spurious noise can simply be edited out of the data manually.
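If smoothing is the appropriate remedy, the sketch below shows one common approach, a Savitzky-Golay filter from SciPy, applied to assumed noisy data; the application provides its own smoothing and data-editing tools, so this is only an illustration of the idea.

```python
# Hedged sketch: smooth noisy low-intensity data before re-fitting,
# using a Savitzky-Golay filter as one common smoothing option.
import numpy as np
from scipy.signal import savgol_filter

x = np.linspace(280.0, 295.0, 300)                     # hypothetical eV axis
y = 20.0 * np.exp(-0.5 * ((x - 285.0) / 0.8) ** 2)     # weak, low-intensity band
y = y + np.random.normal(0.0, 3.0, x.size)             # noisy baseline region

y_smooth = savgol_filter(y, window_length=15, polyorder=3)
```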
Additional Curve Fitting Topics