HOME GUIDE OPERATIONS DOCS ERRORS FORMATS INSTALL NEW TIPS WEB SITES

Methodology of 2D particle alignment


The approaches to 2D particle alignment can be subdivided into several categories. The main division depends on whether a reference image is available, and the secondary division on the degree of variability within the data set, i.e., on how many orientations the particle adopts in the micrograph.

Types of alignment problems:

Reference-based alignment

We assume that the reference image is known or that a good approximation of it is available. We expect all the particles to be noisy versions of the reference, with possible small variations. In this case the alignment problem becomes a pattern-matching problem: every particle has to be placed in the orientation in which it best matches the reference image. When several reference images are used, we also have to decide which reference is the most similar one. The mirrored orientation must be tried as well, since the particle may be flipped.

We use the cross-correlation coefficient to measure the similarity between a particle and a reference. The command that performs reference-based alignment is AP MQ.
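
The following is a minimal sketch of the pattern-matching idea, not the AP MQ implementation. It assumes images stored as lists of rows, restricts the search to cyclic integer shifts plus the mirrored copy (rotations are omitted for brevity), and uses hypothetical names (ccc, shift, mirror, align_to_reference, max_shift) that do not come from SPIDER.

```python
import math

def ccc(a, b):
    """Cross-correlation coefficient of two equal-size images (lists of rows)."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                     sum((y - my) ** 2 for y in ys))
    return cov / norm if norm > 0 else 0.0

def shift(img, dx, dy):
    """Cyclically shift an image by dx columns and dy rows."""
    h, w = len(img), len(img[0])
    return [[img[(r - dy) % h][(c - dx) % w] for c in range(w)]
            for r in range(h)]

def mirror(img):
    """Mirror an image about its vertical axis."""
    return [row[::-1] for row in img]

def align_to_reference(particle, reference, max_shift):
    """Exhaustive search over integer shifts and the mirrored copy.

    Returns (best coefficient, dx, dy, mirrored?).
    """
    best = None
    for flipped in (False, True):
        base = mirror(particle) if flipped else particle
        for dy in range(-max_shift, max_shift + 1):
            for dx in range(-max_shift, max_shift + 1):
                score = ccc(shift(base, dx, dy), reference)
                if best is None or score > best[0]:
                    best = (score, dx, dy, flipped)
    return best

# Toy data: an L-shaped reference and a mirrored, shifted copy of it.
reference = [[0] * 6 for _ in range(6)]
for r, c in [(1, 1), (2, 1), (3, 1), (3, 2)]:
    reference[r][c] = 1
particle = shift(mirror(reference), 1, -1)
score, dx, dy, flipped = align_to_reference(particle, reference, max_shift=2)
```

In this toy case the search has to select the mirrored copy, because no shift of the unmirrored particle reproduces the reference. The cost grows with the square of max_shift, which illustrates why the search range dominates the computation time.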

The b01.amq batch program implements the basic steps of the reference-based alignment. "Expected size of the object" is an important alignment parameter. It determines the search range for the translation parameters. A small expected size will result in a large search range and a very long computation time.

Advantages of reference-based alignment:

  • It is very fast and robust. Since all the reference images are known, every particle can be matched independently to all of them and the correct assignment can be based on a well-defined similarity measure (the correlation coefficient).

  • The best alignment is found in one pass through the reference images.

  • Results are easily verifiable. Since the reference images are known, it can be easily verified by visual inspection whether the aligned particles are in the proper orientation and how well they match the reference images.


    Disadvantages of reference-based alignment:

  • It relies strongly on the assumption that the particles resemble the reference image. If this assumption is not true, the average of the aligned particles will (for noisy data) still look like the reference, and it is difficult to decide whether this similarity is real or is caused by noise that the alignment has reinforced.

  • If exact reference images are not known, it is difficult and time-consuming to come up with a good approximation of the reference.

    Alignment with the reference refinement

    We assume that a set of particles in one orientation is available. The particles are not identical, but they share the same motif. The b03.mar batch program begins by calculating the global average to approximate the reference, then aligns all the images using the AP MQ command, and calculates a new average to obtain an improved reference. These steps are iterated a prescribed number of times. This program uses one additional procedure: alqr.mar.
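
The loop described above (global average, alignment, new average, repeat) can be sketched on toy data. This is an illustration under simplifying assumptions, not the b03.mar procedure: particles are 1D cyclic signals, the only alignment parameter is an integer shift, and all function names are hypothetical.

```python
def roll(signal, k):
    """Cyclically shift a 1D signal k samples to the right."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def average(signals):
    return [sum(col) / len(signals) for col in zip(*signals)]

def best_shift(signal, reference):
    """Shift that maximizes the cross-correlation with the reference."""
    return max(range(len(signal)),
               key=lambda k: dot(roll(signal, k), reference))

def refine_reference(particles, iterations):
    aligned = [list(p) for p in particles]
    reference = average(aligned)       # global average as the first reference
    for _ in range(iterations):
        # Align every particle to the current reference ...
        aligned = [roll(p, best_shift(p, reference)) for p in aligned]
        # ... then recompute the average to obtain an improved reference.
        reference = average(aligned)
    return reference, aligned

motif = [0, 1, 3, 1, 0, 0, 0, 0]
particles = [motif, motif, roll(motif, 1), roll(motif, -1)]
reference, aligned = refine_reference(particles, iterations=2)
```

Because the first reference is the unaligned global average, the outcome depends on how well that average already resembles the motif, which is the dependence discussed under the disadvantages of this approach.
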
    Advantages of alignment with the reference refinement:

  • This procedure is simple, fast, and robust. For a nearly homogeneous data set one can obtain high-quality alignment.

    Disadvantages of alignment with the reference refinement:

  • The result depends on the first approximation of the reference image. By changing the way the first reference image is created, one can obtain different results, and it is difficult to determine which one is correct or better.

  • If the first reference image is not a good approximation of the "true" average, or if the data set contains more than one orientation, the results will not be stable.

    Multireference alignment

    We assume that a very large data set is available. It comprises particles in a few distinct orientations. The data set is sufficiently large that at least some of the similar views occur in similar in-plane orientations, and so can be averaged. Thus, if we can approximately center the particles, the subsequent classification step should reveal some of the classes. These classes are used as reference images in the next multireference alignment step, classification is repeated, and new classes are formed. This procedure is iterated until stable classes are obtained.

    Such a multireference alignment is sometimes called alignment through classification. This name reflects the idea that alignment is done separately within groups produced by the classification step.
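
The alternation between classification and alignment can be sketched as follows. This is a toy illustration, not the b01.mar procedure: particles are 1D cyclic signals, a single nearest-average assignment stands in for the full classification step, the initial classes are seeded from chosen particles, and all names (multireference_align, seeds, rounds, and the helpers) are hypothetical.

```python
import math

def roll(signal, k):
    """Cyclically shift a 1D signal k samples to the right."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]

def average(signals):
    return [sum(col) / len(signals) for col in zip(*signals)]

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def corr(a, b):
    """Correlation coefficient of two signals."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - ma) ** 2 for x in a) *
                     sum((y - mb) ** 2 for y in b))
    return cov / norm if norm > 0 else 0.0

def align_to_best(p, refs):
    """Shift p to best match its most similar reference; return (shifted p, ref index)."""
    _, j, s = max(((corr(roll(p, s), ref), j, s)
                   for j, ref in enumerate(refs) for s in range(len(p))),
                  key=lambda t: t[0])
    return roll(p, s), j

def multireference_align(particles, seeds, rounds):
    aligned = [list(p) for p in particles]
    centers = [list(particles[i]) for i in seeds]   # hypothetical seeding
    labels = []
    for _ in range(rounds):
        # Classification: assign each particle to the nearest class average.
        labels = [min(range(len(centers)), key=lambda j: sqdist(p, centers[j]))
                  for p in aligned]
        # New class averages; an empty class keeps its previous average.
        for j in range(len(centers)):
            members = [p for p, lab in zip(aligned, labels) if lab == j]
            if members:
                centers[j] = average(members)
        # Multireference alignment against the class averages.
        aligned = [align_to_best(p, centers)[0] for p in aligned]
    return aligned, labels

A = [0, 1, 2, 3, 2, 1, 0, 0, 0, 0, 0, 0]
B = [0, 0, 0, 0, 0, 0, 0, 3, 3, 0, 0, 0]
particles = [A, roll(A, 1), roll(A, -1), B, roll(B, 1), roll(B, -1)]
aligned, labels = multireference_align(particles, seeds=(0, 3), rounds=2)
```

The sketch works here only because the shifted copies already overlap enough for the first classification to form meaningful averages, which mirrors the requirement that some similar views occur in similar in-plane orientations.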

    The b01.mar batch program implements the multireference alignment. In this procedure, the search for rotation (done using AP MD) is separated from the search for translation (CC N in procedure alqmd.mar), resulting in a fast, though possibly not the most accurate, program. It uses the following additional procedures:

    alqmd.mar
    centr.mar
    combat.mar

    Another version, b02.mar, uses the AP MQ command to do the alignment. This command employs an exhaustive search to find rotation and translation simultaneously. In principle it should be more accurate, but it is very slow (particularly for a large number of classes). This program uses the following additional procedures:

    alqr.mar
    centr.mar
    combat.mar

    Since multireference alignment is a general idea rather than a detailed algorithm, b01.mar is only one particular implementation. It should be considered a blueprint upon which one can build one's own procedure optimized for the particular data set.

  • It is assumed that all the windowed particles are normalized in the same way.

  • The following free parameters have to be decided:

    (a) - radius for alignment and mask -- should correspond to the particle radius;

    (b) - whether classification is done using all pixels within mask in the computation of Euclidean distance, or factors from Principal Component Analysis (PCA);

    (c) - if PCA is to be used, the number of factors has to be set;

    (d) - the number of groups into which the data set will be divided -- this determines the number of class averages that will be obtained;

    (e) - the number of times the procedure should be repeated.

    The steps implemented in b01.mar:

  • 1. All the particles are centered using centr.mar. In this procedure each particle is centered using its own rotational average as a reference, the particle is shifted, its new rotational average is formed and used as a reference, and so on, until no further shift is possible.

  • 2. The particles are classified using k-means clustering. Depending on the flag set, either the raw particles are classified or a preset number of factors from PCA is used for classification.

  • 3. Class averages are calculated.

  • 4. Class averages are centered using the CG PH command (phase approximation of the center of gravity).

  • 5. Class averages are rotationally aligned using the AP RA command (reference-free rotational alignment).

  • 6. All the particles are aligned using class averages as reference in the procedure alqmd.mar. Each particle is placed in the orientation of its most similar reference image. The alignment includes rotational alignment, shift alignment, and a check of mirrored orientation. Rotational alignment is done using the AP MD command and is separated from the shift alignment. Shift is corrected using the most similar image (as determined by AP MD) as a reference.

  • 7. Alignment parameters are combined with the alignment parameters obtained in the previous step and a new, aligned image series is formed.

  • 8. Steps 2-7 are repeated a prescribed number of times.
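
As an illustration of the centering in step 4, the sketch below shows one common reading of a "phase approximation of the center of gravity": the center along each axis is taken from the phase of the first Fourier harmonic of the image projected onto that axis. This is an assumption made for illustration and is not taken from the CG PH source code; the function names and the zero-based pixel coordinates are likewise hypothetical.

```python
import math

def phase_center(profile):
    """Center estimate from the phase of the first Fourier harmonic of a 1D profile."""
    n = len(profile)
    c = sum(v * math.cos(2 * math.pi * i / n) for i, v in enumerate(profile))
    s = sum(v * math.sin(2 * math.pi * i / n) for i, v in enumerate(profile))
    angle = math.atan2(s, c) % (2 * math.pi)   # phase folded into [0, 2*pi)
    return angle * n / (2 * math.pi)           # phase converted to pixels

def phase_center_2d(img):
    """Apply the 1D estimate to the column sums (x) and the row sums (y)."""
    cols = [sum(col) for col in zip(*img)]
    rows = [sum(row) for row in img]
    return phase_center(cols), phase_center(rows)

# Toy image: a small symmetric blob centered on column 5, row 2.
img = [[0] * 8 for _ in range(8)]
img[2][4], img[2][5], img[2][6] = 1, 2, 1
cx, cy = phase_center_2d(img)
```

Unlike the ordinary center of gravity, this estimate treats the image as periodic, so an object that wraps around the image border is still assigned a single center.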

    Advantages of multireference alignment:

  • It is quite powerful. It is possible to obtain stable groups for data with very low signal-to-noise ratio (SNR). It works for data sets containing a mixture of entirely different views (an often-encountered problem, in which side views are, say, rectangular, and top views are circular).

  • The approach is a general idea rather than a "black-box" program; thus, it can be easily modified to the requirements of a particular data set.

  • There are many parameters that can be adjusted to better control the results.

  • Results are easily verifiable. Since class averages are formed, it can be easily verified whether the aligned particles are in the proper orientation and how well they match the averages.


    Disadvantages of multireference alignment:

  • A very large data set is needed. The program depends on the initial orientation of the particles, i.e., it requires that at least some of the similar views occur in similar in-plane orientations, so that meaningful averages can be formed. Statistically, this can only happen in an adequately large data set. Moreover, these averages should have a sufficiently high SNR to jumpstart the alignment, so they should each contain a sufficient number of particles.

  • The result is somewhat unpredictable. It is impossible in practice to verify whether rare views were revealed as classes or remained misaligned and/or misclassified.

  • Since the approach is a general idea rather than a well-defined procedure, the result will differ depending on the particular implementation. Thus, results obtained by different users/groups are difficult to compare.

  • Even if the general framework is decided upon, the large number of crucial free parameters leaves the user with hard choices to make. The results will depend on the values chosen and will differ from one trial to another. The two most difficult choices are the number of clusters and the number of factors for PCA. Too few clusters will conceal rare views, while too many will result in large numbers of very similar averages, or else the procedure will fail because the SNR is too low.

  • The procedure is very slow.

    Rotationally invariant K-means algorithm

    We assume that the particles were centered and that we can divide the data set into a specified number of classes. In this case, the AP CA command will perform classification and alignment. For each particle, the rotation angle as well as the group assignment will be found. For details see the corresponding manual chapter.
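
The idea of classifying and rotating at the same time can be sketched with a toy k-means in which the distance to a class center is minimized over rotations. This is not the AP CA algorithm. It assumes each centered particle has been resampled onto a ring of angular samples, so that an in-plane rotation becomes a cyclic shift; the seeding and all names are hypothetical.

```python
def roll(ring, k):
    """Rotate a ring of angular samples by k steps (cyclic shift)."""
    n = len(ring)
    return [ring[(i - k) % n] for i in range(n)]

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def average(rings):
    return [sum(col) / len(rings) for col in zip(*rings)]

def rotation_invariant_kmeans(rings, seeds, iterations):
    centers = [list(rings[i]) for i in seeds]
    labels, rotations = [], []
    for _ in range(iterations):
        labels, rotations = [], []
        for ring in rings:
            # Smallest distance over all class centers and all rotations;
            # ties are broken by class index, then by rotation step.
            _, j, k = min((sqdist(roll(ring, k), c), j, k)
                          for j, c in enumerate(centers)
                          for k in range(len(ring)))
            labels.append(j)
            rotations.append(k)
        # Update each center from its members in their best rotation.
        for j in range(len(centers)):
            members = [roll(r, k)
                       for r, lab, k in zip(rings, labels, rotations) if lab == j]
            if members:
                centers[j] = average(members)
    return labels, rotations

A = [5, 1, 0, 0, 1, 0, 0, 0]      # one dominant lobe
B = [3, 0, 0, 0, 3, 0, 0, 0]      # two-fold symmetric
rings = [A, roll(A, 2), roll(A, 5), B, roll(B, 1), roll(B, 2)]
labels, rotations = rotation_invariant_kmeans(rings, seeds=(0, 3), iterations=2)
```

Each returned rotation step k corresponds to an angle of 360 * k / n degrees for a ring of n samples. For the symmetric class B, two rotation steps are equally good, and the sketch simply reports the smaller one.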

    Reference-free alignment

    The rationale of the reference-free alignment is explained in the Introduction to the Reference-Free Alignment Programs. The program seeks orientations for all the particles in the data set such that every possible pair of images is in the 'best' relative orientation, as determined by the maximum of the cross-correlation function (CCF).
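
One way to approximate this goal without an external reference is sketched below. It is an assumption made for illustration, not a description of the SPIDER commands: particles are added one at a time to a running average, and then each particle is repeatedly realigned to the average of all the others. The toy particles are 1D cyclic signals and all names are hypothetical.

```python
def roll(signal, k):
    """Cyclically shift a 1D signal k samples to the right."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def average(signals):
    return [sum(col) / len(signals) for col in zip(*signals)]

def best_shift(signal, reference):
    """Shift at which the cross-correlation with the reference is maximal."""
    return max(range(len(signal)),
               key=lambda k: dot(roll(signal, k), reference))

def reference_free_align(particles, sweeps):
    # Stage 1: add particles one by one, each aligned to the running average.
    aligned = [list(particles[0])]
    for p in particles[1:]:
        ref = average(aligned)
        aligned.append(roll(p, best_shift(p, ref)))
    # Stage 2: realign each particle to the average of all the others,
    # so that no particle is ever aligned against itself.
    for _ in range(sweeps):
        for i, p in enumerate(aligned):
            ref = average(aligned[:i] + aligned[i + 1:])
            aligned[i] = roll(p, best_shift(p, ref))
    return aligned

motif = [0, 1, 3, 1, 0, 0, 0, 0]
particles = [roll(motif, 2), roll(motif, 5), motif, roll(motif, 7)]
aligned = reference_free_align(particles, sweeps=2)
```

The final orientation is arbitrary: here all particles end up in the orientation of the first one, since nothing in the procedure prefers one common orientation over another.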

    The reference-free alignment programs were designed for very noisy data, for particles in many different orientations, and in general for cases in which a reference image is unknown or in which its use could introduce bias and lead to incorrect results. There are three basic commands in SPIDER that implement this strategy: AP SA, AP RA, and AP SR. AP SA is a shift alignment, AP RA is a rotational alignment, and AP SR is a combined shift and rotational alignment. In addition, AP CA performs both classification and rotational alignment for pre-centered data. Unlike the previous procedures, none of these programs checks mirrored orientations; thus, any mirror-related views will appear as two different orientations. All the alignment commands are "do-all" commands -- they perform all the necessary operations and store the alignment parameters in document files. Thus, they can be used either separately or as part of longer, more elaborate alignment schemes.

  • The b05.rtq batch program uses AP RA to rotationally align an image series and applies parameters stored by AP RA in a document file to rotate all the particles. Subsequently, aligned particles are subjected to PCA and classified using Hierarchical Classification.

  • The b02.sub batch program alternates between AP SA and AP RA to align an image series both translationally and rotationally.

  • The b06.aps batch program uses AP SR to align an image series and applies parameters stored in a document file to rotate and shift all the particles. Subsequently, aligned particles are subjected to PCA and classified using Hierarchical Classification.

    Advantages of reference-free alignment:

  • The command AP SR is very fast and robust.

  • The method has very few free parameters -- essentially only the radius of the particle. The results do not depend significantly on these parameters, and there are no assumptions made about the reference, number of groups, and so on.

    Disadvantages of reference-free alignment:

  • It is difficult to assess how well the particles were aligned. In most practical cases the program gives a nearly optimal solution, but in some cases (particularly for mixtures of entirely different shapes, but also for very low SNR or very small data sets) it may fail. In these situations one should use either a combination of AP SA and AP RA (with more free parameters, and thus easier to control) or multireference alignment.

    Source: align.html     Last update: 30 Jan 1998    
    © Copyright Notice /       Enquiries: spider@wadsworth.org