SINGLE PARTICLE ALIGNMENT USING SPIDER 
 
                          by JOACHIM FRANK 
 
 
 
 
INTRODUCTION 
 
 
  0.1 General 
 
         This document is intended as a  practical  step-by-step 
    guide  to  all  procedures of single particle averaging.  If 
    there are any omissions, please let me know. 
 
 
  0.2 Bibliographic Information; Reviews 
 
         We keep a bibliographic list containing all  references 
    on single particle averaging. 
 
         Two recent review articles are: 
 
          (1)  J.   Frank  "New  methods   for   averaging 
          non-periodic  objects  and distorted crystals in 
          biologic electron microscopy"  OPTIK  63  (1982) 
          67-89 
 
          (2)   J.    Frank,   A.    Verschoor,   and   T. 
          Wagenknecht  "Processing of Electron-microscopic 
          Images  of   Single   Macromolecules"   in   NEW 
          METHODOLOGIES     IN    STUDIES    OF    PROTEIN 
          CONFORMATION, ed.  T.T. Wu, Van Nostrand Reinhold,
          Inc., New York 1985, pp 36-89. 
    [Albany only: 
 
         There are two manuscripts describing  the  step-by-step 
    procedure with some technical information not normally found 
    in "materials and methods" sections of papers: 
 
          (1)  "Study   of   electron   micrographs   with 
          Correspondence  Analysis"  (in  French)  by  Guy 
          Cave,  describing  the  analysis  of  hemocyanin 
          molecules during his collaborative visit; 
 
          (2) "The 30S ribosomal subunit:  Elucidation  of 
          its  tRNA  binding site via computer analysis of 
          electron  micrographs"  by  Louis  M.   Miranda, 
          describing  an undergraduate research project in 
          1984.] 
 
         General information on  techniques  of  electron  image 
    processing is found in: 
 
 
          (1) D.L.  Misell  "Image  analysis,  enhancement 
          and   interpretation"   Practical   Methods   in 
          Electron Microscopy, Vol.7 (A.M.  Glauert, Ed.), 
          North Holland, Amsterdam 1978, pp.  1-305 
 
          (2) J.  Frank "Computer processing  of  electron 
          micrographs"    in   "Advanced   Techniques   in 
          Biological  Electron   Microscopy"   ed.    J.K. 
          Koehler, Springer Berlin 1973, pp.  215-274. 
 
 
1.  CONSIDERATIONS FOR CHOOSING THE NUMBER OF PARTICLES 
 
         The aim of averaging is reduction of noise.   This  can 
    only   be   achieved   by   averaging  over  a  sufficiently 
    homogeneous set of particle images.  If  the  population  of 
    particle   images   is  heterogeneous,  then  correspondence 
    analysis must be used to identify  sufficiently  homogeneous 
    subsets, which are then separately averaged. 
 
         Images of the 40S ribosomal subunit occurring in the  L 
    view   /1/   exemplify   a   reasonably   homogeneous   set. 
    (Correspondence analysis showed /2/ that a major  factor  of 
    interimage  variation  is the degree of peripheral staining, 
    which does not affect the structure substantially.) Here,  a 
    set  of  40  particles  was  sufficient  to  bring  out  all 
    structure-related details that can be expected  for  stained 
    particles.   In  general,  between 20 and 40 particle images 
    should be available in the homogeneous subset. 
 
               Example:    a    particle    view    occurs 
          essentially  in  three  versions.   If all three 
          versions occur with roughly the  same  frequency 
          in the micrograph, then a total of 60 (= 3 x 20) 
          to 120 (=3 x 40) particles should be included in 
          the analysis. 
 
         However, this calculation assumes  that  all  particles 
    are  of  a quality suitable for inclusion into the averages. 
    In practice, a certain percentage of the particles  (perhaps 
    up  to  20%)  "misbehave":  either they are damaged, or have 
    some unusual stain feature  that  causes  misalignment  with 
    respect to the rest of the set.  These are recognized in the 
    correspondence analysis as "freaks", and are  excluded  from 
    the averaging. 
 
         It is therefore advisable to  scan  a  somewhat  larger 
    number of particles to make up for this anticipated loss. 
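
         As a rough planning aid, the estimate above can be
    written out as a small calculation.  This is only a sketch in
    Python, not a SPIDER operation.  The function name is
    hypothetical, and it assumes that all versions of the view
    occur about equally often and lose the same percentage of
    particles.

```python
def particles_to_scan(n_versions, per_subset, loss_percent=20):
    """Estimate how many particles to scan (illustrative helper).

    Assumes every version of the view occurs about equally often and
    that the same percentage of particles is rejected in each version.
    """
    needed = n_versions * per_subset
    kept_percent = 100 - loss_percent
    # Ceiling division in integers: scan enough that `needed` survive.
    return -(-needed * 100 // kept_percent)


low = particles_to_scan(3, 20)    # 60 usable particles wanted
high = particles_to_scan(3, 40)   # 120 usable particles wanted
print(low, high)                  # 75 150
```

    With the 20% loss quoted above, the 60 to 120 particles of
    the example become 75 to 150 particles to be scanned.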
 
 
2.  MICRODENSITOMETRY 
 
 
  2.1 Sampling Distance 
 
 
         The sampling distance should be between 1/3 and 1/4  of 
    the  resolution  distance.   For stained macromolecules, the 
    resolution distance cannot be better than 20A.   Hence,  the 
    scanning  distance  should be between Mx5 and Mx7 Angstroms, 
    where M is the magnification at  which  the  micrograph  was 
    recorded.  The final choice of sampling distance is dictated 
    by   the   fixed   aperture   sizes   available    on    the 
    microdensitometer,  since  sampling  and  scanning distances 
    should be matched.  Example:  M = 40,000; then we  look  for 
    an  aperture  size  between  20 (40,000 x 5 x 10[-4]) and 28 
    microns (microns = micrometers).  Since a 25-micron aperture 
    is available, we choose the sampling to be 25 microns, which 
    corresponds to 6.25A on the object scale (25 microns / 40,000).
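
         The arithmetic of this example can be checked with a
    few lines of Python.  This is a sketch only; the function
    names are hypothetical, and the 20A resolution limit is the
    value assumed in the text for stained macromolecules.

```python
ANGSTROM_PER_MICRON = 1.0e4  # 1 micron = 10,000 Angstroms


def aperture_range_microns(magnification, resolution_a=20.0):
    """Return (low, high) sampling distance on the film, in microns.

    The object-scale sampling distance is taken as 1/4 to 1/3 of the
    resolution distance and then scaled up by the magnification.
    """
    low = magnification * (resolution_a / 4.0) / ANGSTROM_PER_MICRON
    high = magnification * (resolution_a / 3.0) / ANGSTROM_PER_MICRON
    return low, high


def object_scale_angstroms(aperture_microns, magnification):
    """Convert a sampling distance on the film back to the object scale."""
    return aperture_microns * ANGSTROM_PER_MICRON / magnification


low, high = aperture_range_microns(40_000)
print(low, round(high, 1))                      # 20.0 26.7
print(object_scale_angstroms(25.0, 40_000))     # 6.25
```

    The upper limit comes out as 26.7 microns rather than 28
    because the text rounds 20A/3 = 6.7A up to 7A.  The 25-micron
    aperture lies inside the range either way.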
 
 
  2.2 Scanning Conventions 
 
         For consistency, we stick to the following conventions: 
 
          (1) the micrograph is placed emulsion-side  down 
          on the microdensitometer; 
 
          (2) scanning directions are +X, +Y; 
 
          (3) display (see below) of images always uses  a 
          left-handed coordinate system. 
 
    This way, the display on the graphics  terminal  corresponds 
    to  the  print  of  the micrograph where the same convention 
    (placing the negative emulsion-side down in the enlarger) is 
    used. 
 
 
  2.3 Scanning 
 
         The scanning with the Perkin Elmer microdensitometer is 
    documented on the MICROGRAPH SCANNING SHEET which is kept in 
    the microdensitometer ("micro-D") room.  It is  a  checklist 
    of  steps  to  go through to operate the instrument, and can 
    only be used by those who have already  been  introduced  to 
    the operation. 
 
 
  2.4 Recordkeeping 
 
         For each data tape recorded, a scanning sheet should be 
    filled  out  containing  the  step  size,  file  dimensions, 
    relative location coordinates  (e.g.   location  of  scanned 
    area  on  micrograph), etc.  This must be placed in the FILM 
    SCANNING FOLDER.  It is recommended that  you  keep  a  copy 
    with your records. 
 
         At this time, you should also fill out a PROJECT  SHEET 
    and select a data extension, a three-letter extension used to identify 
    all data generated from the scanned images in the  computer. 
 
 
    To  avoid  duplication,  check  first in the notebook called 
    PROJECT SHEETS before deciding on a data extension.  Fill out the 
    project  sheet  (scanning date, tape number(s), short title, 
    data  extension,  your  initials)  and  file  it  alphabetically. 
    Retain a copy for your records.  (Later, upon completing the 
    project,  fill  in  all  pertinent  information  identifying 
    backups and filmwrites.) 
 
 
  2.5 Tape Formats 
 
         The tape generated by the Perkin Elmer  scanner  has  a 
    nonstandard  format,  is unlabelled (i.e.  not initialized), 
    and is mounted as "FOREIGN" on the VAX.  There are two types 
    of tape data which you should distinguish: 
 
          (i)  Perkin  Elmer   compatible   ("scans"   and 
          "filmwrites").   These  are  generated either by 
          scanning (transfer of data from the Perkin Elmer 
          to  the  VAX)  or by tapewriting using operation 
          'TW' in SPIDER (transfer of data from the VAX to 
          the microdensitometer).  In the latter case, the 
          tape is used to play back images with the aid of 
          the writing option of the microdensitometer. 
 
          (ii) VMS compatible (generated by "COPY" on  the 
          VAX).   These  are  used  to backup all types of 
          files (not only images) created during an  image 
          processing project. 
 
    To avoid confusion, the type should be clearly indicated  on 
    an  attached  label.   For A-tapes (large), yellow seals are 
    available  to  mark  them  as  micro-D  tapes.    For   B-
    (middle-sized)  and  C-  (small) tapes, use a yellow sticker 
    instead. 
 
 
3.  TRANSFER OF THE DATA 
 
 
  3.1 General 
 
         If an entire  micrograph  is  evaluated,  the  particle 
    selection  is  in  two  steps:   first  the  entire field is 
    scanned and recorded in a  single  tape  file  (The  biggest 
    field  accepted  is  _____________________).  From this tape 
    file, portions of the  field  are  successively  transferred 
    onto  the  computer  disk.   The  size of these subfields is 
    dictated by the limitations of the display device where  the 
    actual particle selection takes place. 
 
         An  alternative  scanning  strategy   that   is   often 
    convenient  is to scan the micrograph as a "checkerboard" of 
    512 x 512 areas, each as a separate file.  This size permits 
    direct TV or hardcopy display. 
 
 
  3.2 Reading Data from Tape into the Computer 
 
         To read the microdensitometer tape into  the  computer, 
    log in on the computer under your user area and allocate the 
    tape drive ($ ALL MSA0:).  Physically mount the tape on  the 
    tape  drive  and  press "Load" and "On-line" buttons.  Mount 
    the tape logically by 
                          $ MOU/FOR MSA0: 
 
    This makes the tape accessible to the tape reading operation 
    of SPIDER.  [Substitute '0' or '1' for 'MSA0:' if you run on 
    the VAX 11/780.] [The VMS operating system of  the  VAX,  as
    well  as  the  SPIDER image processing system, is separately 
    covered by introductory material.] 
 
         Now start SPIDER by typing $DRIVER.  Use the data  extension 
    that  you  selected  on the project sheet.  The project extension 
    solicited by DRIVER may or may not be the same as  the  data 
    extension;  basically  it  is  a  tool  to keep command files and 
    RESULTS files of simultaneously running jobs separate.  (See 
    the  USER.DOC  introduction to SPIDER use.) The reading of
    the tape is accomplished by 'TR'. 
 
 
          EXAMPLE 1:  Interactive tape reading 
 
            TR       
            MSA0:     ;  identify tape unit 
            1         ;  number of tape file 
            W         ;  window option 
            RAW001    ;  name of disk file 
            512,512   ;  size of disk file 
            1,1       ;  top left coordinates of selected 
                      ;    field with respect to field 
                      ;    stored as file on tape 
            2         ;  number of tape file 
               . 
               . 
               . 
              etc. 
 
 
 
    Note that windowing from a given file as well as the reading 
    from different tape files can be in arbitrary order. 
 
         The tape reading for a series of files can be organized 
    by using a DO-loop. 
 
 
          EXAMPLE 2:  Batch reading of successive files 
 
            DO LB1 I=1,10 
            TR 
            MSA0:
            X0    ;  running index used as file
                  ;    number 
            S     ;  i.e. read in entire file 
            RAW00I    ;  running file name according 
                      ;    to value of index I 
            LB1 
            EN 
 
 
 
    To run this DO-loop, the commands listed above must  be  put
    into   a   SPIDER   batch   command   file   with  the  name
    Bnn.EXT, where nn is a two digit number and EXT is the
    currently active 3-letter project extension defined  at  the
    beginning of the session.
          Example:  B23.FIS  (if FIS is the project extension) 
    To execute this batch command stream, simply type  B23  when 
    SPIDER  solicits  the next operation.  SPIDER will then look 
    for the file B23.FIS  and  execute  the  commands  contained 
    within it. 
 
         Similarly, a DO-loop may  be  used  to  window  several 
    fields  from  one large tape file (in this example, the tape 
    file is at least 1024x1024): 
 
 
          EXAMPLE 3:  Batch reading of successive fields 
 
            DO LB1 I=1,4 
            RR X10    ;  read x component of top 
                      ;    left coordinates into 
                      ;    register X10 
            1,512,1,512 
            RR X11    ;  read y component of top 
                      ;    left coordinates into 
                      ;    register X11 
            1,1,512,512 
            TR 
            MSA0: 
            (1)       ;  same tape file throughout 
            W 
            WIN00I 
            (512,512) ;  dimensions 
            X10,X11   ;  top left coordinates 
            LB1 
            EN 
 
 
               NOTE:  The parentheses  around  a  floating
          point  number  or  pair  of integers are used to
          indicate repeated  use  of  the  same  number(s) 
          throughout   the   DO-loop.    E.g.,   (512,512) 
          indicates that each of the four  windows  should 
          be created with the same dimensions, 512x512. 
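
         The coordinate lists fed to the two RR operations in
    Example 3 follow a simple row-by-row pattern, which can be
    generated as shown in this Python sketch.  The function name
    and the "stride" parameter are hypothetical and are not part
    of SPIDER.

```python
def window_origins(n_cols, n_rows, stride):
    """Top-left (x, y) of each window, 1-based, row by row."""
    return [(1 + ix * stride, 1 + iy * stride)
            for iy in range(n_rows)
            for ix in range(n_cols)]


# Stride 511 reproduces the coordinates used in Example 3.
print(window_origins(2, 2, 511))  # [(1, 1), (512, 1), (1, 512), (512, 512)]

# Stride 512 gives 512 x 512 windows that do not overlap.
print(window_origins(2, 2, 512))  # [(1, 1), (513, 1), (1, 513), (513, 513)]
```

    With the values of Example 3, neighboring 512 x 512 windows
    share one column or one row.  Starting the second window at
    513 instead would tile a 1024 x 1024 field without overlap.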
 
 
4.  SELECTION OF PARTICLE IMAGES USING THE INTERACTIVE  PARTICLE 
SELECTION (IPS) PROGRAM 
 
 
  4.1 Invoking the Program and Display Conventions 
 
         The next processing step is the selection of individual 
    particles  from  conveniently sized fields (~512x512) of the 
    micrograph.  This is done by using the graphics terminal for 
    display  and  a  manually  controlled cursor for pointing to 
    suitable particles to be selected.  The  program  organizing
    this  interactive  selection  is  self-standing  on  the VAX
    11/780; on the VAX 11/750 it is invoked as  an  option
    of the 'TV' command of SPIDER:
 
                        .Operation:  TV IPS 
 
    A menu appears on the screen.  To display an image stored in 
    file  RAW001, type D RAW001.  The image is re-scaled to give 
    maximum contrast on the screen.  High optical density values 
    are displayed in white, low values in black.  Therefore, the
    picture on the screen appears  with  the  same  polarity  of
    contrast as a print made from the micrograph film or plate.
 
         The display always starts at the top left corner of the 
    image.  Each scanned line produces a horizontal line running 
    from left to right on the screen, resulting in a left-handed 
    coordinate  system,  no matter what the coordinate system of 
    the scanning was.  With the scanning conventions  introduced 
    before,  the  picture  on  the  screen appears with the same 
    handedness as a print made from the micrograph. 
 
 
  4.2 Particle Selection 
 
         The use of the IPS program is separately documented.
 
         Essentially, the 'W' command  is  used  to  invoke  the 
    windowing  part  of  the  program,  and  to define the first 
    destination file in a series of  files  to  be  created.   A 
    document file needs to be specified to receive the windowing 
    coordinates of each file.  [A document file is  a  universal 
    vehicle for transferring keyed values of parameters from one 
    SPIDER session to another.  In our case, the key is used  to 
    identify  the particle (or window) number, and the values to 
    be saved are the top left coordinates and the dimensions  of 
    the  window selected.] This information can later be used in 
    a batch file to reselect the particles without going through 
    the interactive selection again. 
 
         The particles selected form a file series with a common 
    prefix and a common data extension, e.g. 
 
                  WIN001, WIN002, ..., WIN082.BOU 
 
 
    Many operations in SPIDER allow individual files of the file 
    series  to  be referred to by implicit DO-loops.  The prefix 
    is specified, and the file numbers are specified in the form 
    2-8,13,17-21  etc.  Examples are MN (Montage; see below) and 
    AS (Add/compute statistics). 
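
         To make the file-number notation concrete, the Python
    sketch below expands such a list into the individual file
    names of a series.  It only illustrates the notation; the
    function names are hypothetical, and the three-digit zero
    padding is assumed from names such as WIN001.

```python
def expand_file_numbers(spec):
    """Expand a list such as '2-8,13,17-21' into individual numbers."""
    numbers = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(s) for s in part.split("-"))
            numbers.extend(range(first, last + 1))
        else:
            numbers.append(int(part))
    return numbers


def series_names(prefix, spec, digits=3):
    """Build file names (prefix + zero-padded number) for a series."""
    return [f"{prefix}{n:0{digits}d}" for n in expand_file_numbers(spec)]


names = series_names("WIN", "2-8,13,17-21")
print(len(names), names[0], names[-1])   # 13 WIN002 WIN021
```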
 
 
5.  PREPROCESSING 
 
 
  5.1 Montage Operation 
 
         The first action after  selecting  the  images  is  the 
    inspection  of  the  entire  gallery.  There are two ways of 
    doing this: 
 
          (i) While  still  in  the  interactive  particle 
          selection  program,  erase  the  entire field by 
          typing 'E', then use the multiple display option 
 
                         .TV:  D WIN001-040 
 
          This  displays  the  image  series  in  columns, 
          following the same organization of the screen as 
          with  repeated  use  of  the  'D'   option   for 
          different images. 
 
          (ii) Put the image  series  into  a  montage  by 
          using  operation 'MN'.  This operation creates a 
          single picture  containing  the  images  of  the 
          series  organized by rows.  Optionally, a margin 
          can be specified.  The montage may be  displayed 
          on   various  devices  using  any  one  of  four 
          operations:  'TV', 'TV IPS', 'GS', and 'TW'.
 
 
          EXAMPLE 4:  Put a file series into a montage 
 
 
            MN        ;  operation montage 
            WIN       ;  prefix of file series 
            1-82      ;  file numbers to be used 
            6,5       ;  number of images per row 
                      ;    and margin size (pixels) 
            0.5       ;  background constant 
            MON001    ;  name of output file 
 
 
 
    Note  that  this  operation  inserts  the   images   without 
    rescaling.   If  images with widely different density ranges 
    are to be put into the  same  montage,  use  'MN  S',  which 
    rescales  each individual image into the range 0-2, assuming 
    maximum contrast.  In this case, you have to  be  sure  that 
    the background constant is within the density range 0-2. 
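
         The layout produced by a montage can be pictured with
    the following Python/NumPy sketch.  It is not the SPIDER
    implementation: the function names are hypothetical, and the
    placement of the margin (around every image and along the
    outer border) is an assumption.

```python
import numpy as np


def rescale_0_2(img):
    """Stretch one image to the range 0-2 (as described for 'MN S')."""
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img, dtype=float)
    return 2.0 * (img - lo) / (hi - lo)


def montage(images, per_row, margin, background, rescale=False):
    """Arrange equally sized images by rows on a constant background."""
    if rescale:
        images = [rescale_0_2(im) for im in images]
    h, w = images[0].shape
    n_rows = -(-len(images) // per_row)          # ceiling division
    out = np.full((n_rows * (h + margin) + margin,
                   per_row * (w + margin) + margin),
                  background, dtype=float)
    for k, im in enumerate(images):
        r, c = divmod(k, per_row)
        y = margin + r * (h + margin)
        x = margin + c * (w + margin)
        out[y:y + h, x:x + w] = im
    return out


rng = np.random.default_rng(0)
series = [rng.normal(size=(64, 64)) for _ in range(82)]
mon = montage(series, per_row=6, margin=5, background=0.5, rescale=True)
print(mon.shape)   # (971, 419)
```

    With rescaling switched on, every image occupies the range
    0-2, so a background constant of 0.5 stays inside the density
    range of the montage.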
 
 
  5.2 General Image  Display  Facilities--Monitor  and  Hardcopy 
  Options 
 
         (a) TV IPS and its option 'D' may be  used  to  display 
    any image. 
 
         (b) TV is the normal display operation.  It  allows  no 
    manipulation  of  the  image.   Each  new image is displayed 
    below or alongside of images already residing on the screen, 
    allowing  multi-user  access  to the same screen.  Both 'TV' 
    and 'TV IPS' images on the screen may be preserved by use of 
    the Tektronix hardcopy device. 
 
 
          EXAMPLE 5:   Use  of  TV  operation  to  get 
          standard black and white display 
 
            TV 
            MON001    ;  file to be displayed 
            1,0       ;  (=) size factor and 
                      ;    color change option 
 
 
 
 
         The above example generates a standard black and  white 
    image  on  the  monitor.   Instead  of  the  black and white 
    (halftone) values/levels, colors  can  be  assigned  to  the 
    pixel  values.   Due to the increased sensitivity of the eye 
    to color changes, a much larger dynamic range of visual data 
    can be perceived. 
 
         Since  the  colors  assigned  to  the  image   do   not 
    correspond to any physical attribute of the object depicted, 
    these displays are called "false color displays". 
 
         The range of colors assigned to  the  pixel  values  is 
    defined  by  the color lookup table.  This is a table of 255 
    sets of three numbers, one each for red,  green,  and  blue. 
    Each  number  ranges  from  0  to  255,  and  indicates what 
    proportion of the particular color should be mixed with  the 
    two other primary colors to generate the desired hue. 
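
         How a lookup table turns pixel values into colors can
    be sketched in a few lines of Python.  The two tables built
    here are hypothetical examples, not any of the implemented
    tables, and the linear scaling of pixel values onto the table
    entries is an assumption.

```python
def grey_table(n_entries=255):
    """Black-to-white ramp: red = green = blue for every entry."""
    return [(round(255 * i / (n_entries - 1)),) * 3
            for i in range(n_entries)]


def blue_to_red_table(n_entries=255):
    """Hypothetical false-color ramp from pure blue to pure red."""
    table = []
    for i in range(n_entries):
        t = i / (n_entries - 1)
        table.append((round(255 * t), 0, round(255 * (1 - t))))
    return table


def apply_table(pixels, table):
    """Map each pixel value to the (red, green, blue) entry of the table."""
    lo, hi = min(pixels), max(pixels)
    span = (hi - lo) or 1.0          # avoid division by zero
    top = len(table) - 1
    return [table[round((p - lo) / span * top)] for p in pixels]


line = [0.0, 0.5, 1.0]
print(apply_table(line, grey_table())[0])          # (0, 0, 0)
print(apply_table(line, blue_to_red_table())[-1])  # (255, 0, 0)
```

    Switching tables changes only the colors on the screen; the
    pixel values stored in the image file stay the same.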
 
         Forty-seven color tables  are  implemented.   They  are 
    documented  in  a  folder  marked  with red, green, and blue 
    tape.  They have trivial names referring  to  their  origin; 
    e.g.    UCLA,   UT1  (University  of  Texas),  GE2  (General 
    Electric). 
 
         In order to switch the display of  all  images  already 
    existing  on  the  screen to one of the color tables, use TV 
    with an escape for the image name: 
 
 
          EXAMPLE 6:  Change color table 
 
 
            TV 
            * 
            METHEA7 
 
 
 
    In this example,  the  color  lookup  is  changed  from  the 
    previously assigned table to the table called 'METHEA7'. 
 
         (c) (VAX  11/780  only)  GS  (Grey  scale)  produces  a 
    VERSATEC hardcopy of the image.  This is very convenient for 
    recordkeeping.  (Note that it is completely  independent  of 
    monitor display.) 
 
 
          EXAMPLE 7:  Use of VERSATEC to  display  two 
          montages 
 
            GS        ;  Versatec operation 
            2         ;  number of images 
            MON001    ;  first input file 
            MON002    ;  second input file 
            1         ;  size factor 
 
 
 
 
         (d) TW (Tape write) is the reverse of  the  TR  (Tape
    read) operation.  The picture stored on  disk  is  put
    onto  tape  in  a  format  that  is  compatible   with   the
    microdensitometer,   and   can   be  'played  back'  onto  a
    photographic negative as a 'film write'.  This procedure  is
    only  used  to  produce  high-quality representations of the
    image for publication or documentation.
 
         The TW operation puts a frame around the image and adds 
    a   greyscale  at  the  top  and  a  label  at  the  bottom. 
    Therefore, the number of lines of the tape-written image  is 
    always  larger than the number of lines (NROW) of the actual 
    image stored on disk.  Since normally several  pictures  are 
    filmwritten  onto  the  same  sheet of film, the exact final 
    dimensions of  each  picture  are  needed  to  organize  the 
    filmwriting.   This  information is provided by TW after the 
    operation is finished  and  appears  as  a  message  on  the 
    screen. 
 
 
          EXAMPLE 8:  Tape-write an image 
 
 
            TW        ;  tape write operation 
            MON001    ;  file name 
            3         ;  tape file number 
            FL        ;  option to use flip scan 
               .      ;    format and add a label 
               .      ;    underneath 
               .      ;  continue with next file 
               .      ;    number or terminate 
               .      ;    with '*' 
 
 
 
 
         Note that the flip scan format is most practical  since 
    it   is   compatible   with  a  mode  of  operation  of  the 
    microdensitometer where lines are written in both travelling 
    directions.   This  is  also the normal mode of the scanning 
    operation. 
 
 
  5.3 Masking 
 
         For the next processing step, that of alignment of  the 
    windowed  images, it is useful (though not always necessary) 
    to mask the particle images, allowing only pixels  belonging 
    to   the   particle   and   its  immediate  surroundings  to 
    participate in the  alignment.   Otherwise,  exterior  stain 
    clumps  and neighboring particles could bias the orientation 
    and translation search.  In  addition,  the  elimination  of 
    carbon  film  structure  ("structural noise") outside of the 
    particle  increases  the  signal  to  noise  ratio  of   the 
    correlation  detection,  and  helps to boost the correlation 
    peak height in the case of faintly visible  particles  (e.g. 
    in minimum dose or low dose micrographs). 
 
         There are two types of masks,  suitable  for  different 
    steps  of the single particle alignment procedure:  circular 
    masks, and masks tailored to the particle shape.  Only masks 
    in  the first category are used in pre-alignment processing, 
    since  tailored  masks  may  introduce  a  bias   into   the 
    determination of angles and shifts. 
 
         The circular masking procedure provides the options  of 
    having a sharp cutoff or a smooth cutoff.  The smooth cutoff 
    is realized by a piecewise gaussian  function  which  slowly 
    reduces  the  contrast at the edge of the mask as a function 
    of radius, without affecting the value of the  mean  density 
    of  the  image.   The  so-called  halfwidth  of the gaussian 
    function determines the  range  within  which  the  contrast 
    falls to 1/e = 37% of its full value. 
 
 
          EXAMPLE 9:  Use  of  a  circular  mask  with 
          gaussian falloff 
 
 
            MA        ;  circular mask operation 
            WIN001    ;  input file 
            MAS001    ;  output file 
            12.5      ;  outside radius 
            G         ;  gaussian option selected 
            P         ;  precise average of image 
                      ;    area after masking to 
                      ;    be used as background 
            33,33     ;  mask center coordinates 
            3.        ;  halfwidth 
 
 
                               NOTES: 
 
               a) Naming Conventions.  It is a  good  idea 
          to use names that refer to the type of image, or 
          to its role in the entire  project.   This  will 
          later  make it much easier to make sense of your 
          documentation in the project notebook.   In  the 
          above  example,  WIN  is the prefix of the input 
          file set, MAS is that of the output file set. 
 
               b) Mask Center Coordinates.  For a  picture 
          with   dimensions   (NSAM,NROW)  that  are  even 
          numbers, the true center  does  not  fall  on  a 
          pixel.   By convention, the sample center is put 
          at (NSAM/2 + 1, NROW/2 + 1).  For (64,64) files, 
          the   sample   center   would   be  at  (33,33). 
          Operation  RT  (rotate)  uses  this  center   as 
          rotation center. 
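
         The effect of a circular mask with gaussian falloff,
    centered according to the convention in note (b), can be
    illustrated with the Python/NumPy sketch below.  It is not
    the code of operation MA.  The falloff formula
    exp(-((r - R)/halfwidth)**2) and the choice of the mean
    inside the mask radius as background are assumptions, chosen
    so that the contrast falls to 1/e one halfwidth outside the
    radius.

```python
import numpy as np


def sample_center(nsam, nrow):
    """Center convention of the text: (NSAM/2 + 1, NROW/2 + 1), 1-based."""
    return nsam // 2 + 1, nrow // 2 + 1


def circular_mask_gaussian(img, radius, halfwidth, center=None):
    """Keep the image inside `radius`; fade to the background outside."""
    nrow, nsam = img.shape
    cx, cy = center if center is not None else sample_center(nsam, nrow)
    # 1-based pixel coordinates, to match the convention of the text.
    y, x = np.mgrid[1:nrow + 1, 1:nsam + 1]
    r = np.hypot(x - cx, y - cy)
    weight = np.ones_like(r)
    outside = r > radius
    weight[outside] = np.exp(-((r[outside] - radius) / halfwidth) ** 2)
    # Assumed background: average density inside the mask radius.
    background = img[~outside].mean()
    return background + (img - background) * weight


rng = np.random.default_rng(1)
win = rng.normal(loc=1.0, size=(64, 64))
mas = circular_mask_gaussian(win, radius=12.5, halfwidth=3.0)
print(sample_center(64, 64))   # (33, 33)
```

    Pixels inside the radius are left unchanged, and pixels far
    outside take the constant background value, so that stain
    clumps and neighboring particles no longer contribute.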
 
 
         In order to apply the masking operation to a series  of 
    files, we must set up a batch file: 
 
 
          EXAMPLE 10:  Batch masking of particles 
 
            ;B04.PRJ  1/9/85  Mask hemocyanin top views 
            DO LB1 I=1,82 
            MA 
            WIN00I    ;  input file 
            MAS00I    ;  output file 
            (12.5)    ;  outside radius 
            G         ;  gaussian option 
            P         ;  use precise average 
            (33,33)   ;  mask center coordinates 
            (3.)      ;  halfwidth 
            LB1       ;  DO loop goes to here 
            EN        ;  end 
 
 
 
 
         At  the  end  of  this  batch  run,  the  masked  files 
    MAS001...MAS082 exist on disk. 
 
 
6.  ALIGNMENT 
 
  6.1 Philosophy of Single Particle Alignment 
 
         The hand-selected particles stored in the  file  series
    WIN00I  are  only  roughly  centered,  and  may  appear
    in  different  orientations.   The  aim  of  the   alignment
    procedure is to put all particles into the same position and 
    orientation with respect to the image  frame,  so  that  any 
    pixel  (k,i)  corresponds  to the same point of the molecule 
    projection. 
 
 
    6.1.1 Translational Alignment 
 
         Translational alignment of particles occurring  in  the 
    same   orientation   is   achieved  with  the  help  of  the 
    cross-correlation operation (CC). 
 
 
          EXAMPLE 11:  Find the shift between two images 
 
            CP        ;  Copy first image into  
                      ;    working file 
            MAS001    ;  input file 
            CCF001    ;  output file 
            CC IC     ;  Cross-correlate in-core 
            CCF001    ;  first image, to be  
                      ;    overwritten by the CCF 
            MAS002    ;  second image, to be  
                      ;    overwritten by its Fourier 
                      ;    transform 
            N         ;  no filtration requested 
            PK X10,X11;  peak search 
            CCF001    ;  cross-correlation function 
            3         ;  number of peaks to be searched for 
 
 
                               NOTES: 
    This is a sequence  of  three  operations:   (a)  copy,  (b) 
    cross-correlate, and (c) peak search. 
 
         (a) Copying the first  file  into  a  working  file  is
    necessary  to  protect  it  from  being  destroyed,  because
    the operation CC overwrites its first input  file  with  the
    cross-correlation  function  (CCF).   Anticipating  that the
    working file will eventually contain the CCF, we give it the 
    name CCF001 ahead of time. 
 
         (b) The CC operation creates the CCF, and it overwrites 
    the second image by its Fourier transform. 
         [This  happens   because   the   computation   of   the 
    cross-correlation   makes   use   of   a   Fourier  theorem: 
    symbolically, it can be expressed in the form 
 
                   CCF = F[-1] {F {p1} x F* {p2}} 

                               where 
              F = Fourier transformation 
              F[-1] = inverse Fourier transformation 
              F* = Fourier transformation followed 
                  by complex conjugation 
              p1,p2 = images #1 and #2 
 
    At the stage where the product F{p1} F* {p2}  is  formed,  a 
    filter  function  may be applied.  In the above example, the 
    input parameter 'N' requests that this option should not  be 
    used.] 
 
         [Application of a filter  function  at  this  point  is 
    equivalent  to  applying operation 'FF' to one of the images 
    prior to 'CC' execution.   The  purpose  of  the  filtration 
    would  be the enhancement of the s/n (signal-to-noise) ratio 
    in the CCF, by suppression of noise at  spatial  frequencies 
    higher  than  the  resolution  limit.  This is needed if the 
    micrographs  have  a  very  low  contrast,  e.g.    low-dose 
    micrographs.] 
 
         The CCF is a two-dimensional function that has the same 
    format  as  the  image,  and  can  be  displayed by the 'TV' 
    operation.  The origin of the function  is  at  (NSAM/2  +1, 
    NROW/2  +  1).  A peak at this point would indicate that the 
    two  images  are  maximally  correlated  in  the   unshifted 
    position.  If the peak occurred at (NSAM/2 + 1 + KSH, NROW/2 
    + 1 + ISH), this would mean that the images are shifted with 
    respect to each other by (KSH, ISH). 
 
         (c) The exact position of the peak in the CCF  relative 
    to the point (NSAM/2 + 1, NROW/2 + 1) is determined by 'PK'. 
    This is a general peak search operation, and for the present
    application no more than the three highest peaks need to  be
    searched for.
 
         [Why 3 peaks and not just  the  highest?   Because  the 
    list  of  the  3 highest peaks provides a good check whether 
    the highest peak is indeed a detection peak or just a  noise 
    fluctuation.   If  it is a detection peak, it will stand out
    from peaks #2 and #3, which will both be of the  same  order
    of  magnitude.   If it is a noise peak, then all three peaks
    will be of the same order of magnitude.  This  list  appears
    in the
    file RESULTS.PRJ.] 
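
         The computation described in notes (b) and (c) can be
    reproduced outside SPIDER with the Python/NumPy sketch below.
    It follows the formula CCF = F[-1] {F {p1} x F* {p2}} and
    places the origin at (NSAM/2 + 1, NROW/2 + 1).  The function
    names are hypothetical, and the simple neighbor comparison
    used to find peaks is an assumption, not the method of
    operation PK.

```python
import numpy as np


def cross_correlate(p1, p2):
    """CCF of two equally sized images, origin moved to the center."""
    product = np.fft.fft2(p1) * np.conj(np.fft.fft2(p2))
    ccf = np.fft.ifft2(product).real
    # fftshift moves zero shift to index N/2 (0-based) = N/2 + 1 (1-based).
    return np.fft.fftshift(ccf)


def local_peaks(ccf, n_peaks=3):
    """The n highest local maxima as (height, KSH, ISH) from the center."""
    is_peak = np.ones(ccf.shape, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                is_peak &= ccf >= np.roll(ccf, (dy, dx), axis=(0, 1))
    rows, cols = np.nonzero(is_peak)
    order = np.argsort(ccf[rows, cols])[::-1][:n_peaks]
    cy, cx = ccf.shape[0] // 2, ccf.shape[1] // 2
    return [(float(ccf[rows[i], cols[i]]),
             int(cols[i] - cx), int(rows[i] - cy)) for i in order]


rng = np.random.default_rng(0)
p2 = rng.standard_normal((64, 64))            # reference image
p1 = np.roll(p2, (3, -5), axis=(0, 1))        # shifted by ISH=3, KSH=-5
peaks = local_peaks(cross_correlate(p1, p2))
print(peaks[0][1:])                           # (-5, 3)
```

    In this synthetic test the first peak stands well above the
    second and third, which is the behavior expected of a
    detection peak.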
 
         The peak position with respect to the center of the CCF 
    is  returned  as a pair of floating point numbers in the two 
    registers  specified,  in  this  example   X10,X11.    These 
    registers  are used to transfer the shift components from PK 
    to a subsequent SH (SHift) operation.  [Any register between 
    X10  and  X99  may  be  used  for transfer of values between 
    different operations.] 
 
         PK determines the exact (non-integer) peak position  by 
    a  nine-point  parabolic fit.  To realize this accuracy, the 
    "floating-point shift" operation SH F should be used  rather 
    than  the  single SH operation.  SH F computes each pixel of 
    the shifted  image  in  arbitrary  non-integer  position  by 
    bilinear interpolation. 
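
         The two ideas of this paragraph, a parabolic fit for
    the non-integer peak position and a shift by bilinear
    interpolation, are sketched below in Python/NumPy.  This is a
    simplified illustration under stated assumptions: the fit
    uses two separate three-point parabolas (along the row and
    the column through the peak) instead of the nine-point fit of
    PK, and the shift wraps around at the image border.  All
    function names are hypothetical.

```python
import numpy as np


def parabolic_offset(left, center, right):
    """Offset of a parabola's vertex from the middle of three samples."""
    denom = left - 2.0 * center + right
    return 0.0 if denom == 0 else 0.5 * (left - right) / denom


def refine_peak(ccf, row, col):
    """Sub-pixel (KSH, ISH) from the center for an interior integer peak."""
    dx = parabolic_offset(ccf[row, col - 1], ccf[row, col], ccf[row, col + 1])
    dy = parabolic_offset(ccf[row - 1, col], ccf[row, col], ccf[row + 1, col])
    cy, cx = ccf.shape[0] // 2, ccf.shape[1] // 2
    return col + dx - cx, row + dy - cy


def shift_bilinear(img, dx, dy):
    """Shift by (dx, dy) pixels with bilinear weights (circular wrap)."""
    fx, fy = int(np.floor(dx)), int(np.floor(dy))
    ax, ay = dx - fx, dy - fy

    def rolled(sy, sx):
        return np.roll(img, (sy, sx), axis=(0, 1))

    return ((1 - ax) * (1 - ay) * rolled(fy, fx)
            + ax * (1 - ay) * rolled(fy, fx + 1)
            + (1 - ax) * ay * rolled(fy + 1, fx)
            + ax * ay * rolled(fy + 1, fx + 1))


# A smooth synthetic peak at column 20.3, row 14.6 of a 32 x 32 array.
yy, xx = np.mgrid[0:32, 0:32]
ccf = -((xx - 20.3) ** 2 + (yy - 14.6) ** 2)
row, col = np.unravel_index(np.argmax(ccf), ccf.shape)
ksh, ish = refine_peak(ccf, row, col)
print(round(ksh, 3), round(ish, 3))   # 4.3 -1.4
```

    For whole-number shifts the bilinear weights reduce to a
    plain pixel shift, so the interpolation matters only for the
    fractional part of the shift vector.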
 
 
          EXAMPLE 12:  Shift one image by  a  vector  previously 
          found by CC, PK 
 
            SH F      ;  Shift with floating-point  
                      ;   option 
            MAS001    ;  Input file 
            SHI001    ;  Shifted output file 
            -X10,-X11 ;  Negative shift vector  
                      ;   components contained in 
                      ;   registers from the PK 
                      ;   operation 
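
         What a floating-point shift with bilinear interpolation
    does to the pixels may be easier to see in a few lines  of
    Python.   This  is  a  sketch, not the SH F code:  the name
    shift_bilinear is hypothetical, and the  circular  (wrap-
    around) treatment of the image border is an assumption.

```python
import math


def shift_bilinear(image, shift_x, shift_y):
    """Shift an image (list of rows) by a non-integer vector."""
    nrow, nsam = len(image), len(image[0])

    def pixel(y, x):
        return image[y % nrow][x % nsam]   # wrap-around is an assumption

    shifted = [[0.0] * nsam for _ in range(nrow)]
    for y in range(nrow):
        for x in range(nsam):
            # Position in the input image that lands on output pixel (x, y).
            src_x, src_y = x - shift_x, y - shift_y
            x0, y0 = math.floor(src_x), math.floor(src_y)
            fx, fy = src_x - x0, src_y - y0
            # Weighted mean of the four surrounding input pixels.
            shifted[y][x] = (
                (1 - fx) * (1 - fy) * pixel(y0, x0)
                + fx * (1 - fy) * pixel(y0, x0 + 1)
                + (1 - fx) * fy * pixel(y0 + 1, x0)
                + fx * fy * pixel(y0 + 1, x0 + 1)
            )
    return shifted


image = [[0.0] * 6 for _ in range(6)]
image[2][2] = 1.0
half_step = shift_bilinear(image, 0.5, 0.0)
whole_step = shift_bilinear(image, 1.0, 2.0)
print(half_step[2][2], half_step[2][3], whole_step[4][3])
```

         A shift by half a pixel spreads the single bright pixel
    evenly over two neighbours, whereas an  integer  shift  only
    moves it.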
 
 
 
         [Whether or not the peak position vector  needs  to  be 
    inverted  in  sign  depends  on  the  order in which the two 
    images entered the CC operation.  As a rule, the first image 
    entering  the  operation (= the one to be overwritten by its 
    CCF) has to have the sign of the position vector inverted in 
    a  subsequent  shift,  as  in the above example.  The second 
    image entering the CC operation is considered the  reference 
    image.  It is overwritten by its Fourier transform, in which 
    form it can  serve  as  a  reference  in  any  subsequent CC
    operations on a file series.] 
 
         After the above shift operation,  the  two  images  are 
    perfectly aligned.  Remember that we assumed that the images 
    were correctly oriented to begin with.  How this is achieved 
    will be explained in the next section. 
 
 
    6.1.2 Orientation Alignment 
 
         Before the particles can  be  translationally  aligned, 
    they  must  be brought into the same orientation.  This must 
    be done by a procedure that does not depend on the  relative 
    translational  positions of the particles because we have no 
    way of  correcting  these  without  first  establishing  the 
    orientation. 
 
         Therefore we make use of the autocorrelation  functions 
    (ACF)  of the images.  The ACF of an image is the CCF of the 
    image with itself.  It  is  a  two-dimensional,  displayable 
    function which has the following properties: 
           (i) it has a high peak at its center (at NSAM/2 + 
        1, NROW/2 + 1) 
           (ii) it is invariant  to  a  translation  of  the 
        image 
           (iii) an off-center peak at  (XP,YP)  means  that 
        the  vector  (XP,YP)  or  the  vector  (-XP,-YP)  is 
        prominent in the image 
           (iv)  it  is  centrosymmetric;  i.e.   any   peak 
        encountered  at (XP,YP) is always accompanied by its 
        twin at (-XP,-YP). 
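
         Properties (i) to (iv) can be checked numerically on  a
    toy image.  The Python sketch below is not the AC operation.
    It assumes a circular (wrap-around) ACF computed directly, 
    0-based indices with the origin at [NROW/2][NSAM/2], and a
    hypothetical two-pixel "particle" built by two_blobs.

```python
def circular_acf(image):
    """Circular auto-correlation, zero shift at [nrow // 2][nsam // 2]."""
    nrow, nsam = len(image), len(image[0])
    cy, cx = nrow // 2, nsam // 2
    return [
        [
            sum(
                image[(y + row - cy) % nrow][(x + col - cx) % nsam] * image[y][x]
                for y in range(nrow)
                for x in range(nsam)
            )
            for col in range(nsam)
        ]
        for row in range(nrow)
    ]


def two_blobs(x1, y1, x2, y2, size=8):
    """Test image with two bright pixels at (x1, y1) and (x2, y2)."""
    image = [[0.0] * size for _ in range(size)]
    image[y1][x1] = 1.0
    image[y2][x2] = 1.0
    return image


particle = two_blobs(2, 3, 5, 4)   # inter-blob vector (XP, YP) = (3, 1)
moved = two_blobs(3, 5, 6, 6)      # same particle, translated by (1, 2)
acf = circular_acf(particle)
centre = 4
print("central peak:     ", acf[centre][centre])
print("peak at (+3, +1): ", acf[centre + 1][centre + 3])
print("twin at (-3, -1): ", acf[centre - 1][centre - 3])
print("same ACF after translation:", acf == circular_acf(moved))
```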
 
         What does  it  mean  to  say  "the  vector  (XP,YP)  is 
    prominent  in  the  image"?   An example would be a particle 
    consisting of two  defined  density  blobs  separated  by  a 
    defined  distance D.  The relative position of the blobs can 
    be characterized by two parameters:  length and  orientation 
    of the inter-blob vector. 
 
         Both parameters can be determined from the  ACF:   from 
    the  radial  position  of the peaks occurring at (XP,YP) and 
    from the angle of the radius vector  connecting  the  origin 
    with the peak. 
 
         If the particle consisting of the two blobs rotates  by
    a certain angle phi, the peak in the ACF rotates by the same
    angle in the same direction.
 
         Hence, the relative orientation of two  such  particles 
    can  be  found by comparing their ACFs with one another.  In 
    the computer,  this  is  done  by  an  operation  called  OR 
    (ORient),  for  orientation search.  This operation compares 
    two-dimensional patterns and determines  for  which rotation
    angle  they  come  into  maximum  overlap.   Therefore,  the 
    orientation between two particles stored in  images  PIC001, 
    PIC002  can  be  determined  by  comparing  their  ACFs  and 
    matching these by OR. 
 
 
          EXAMPLE 13:  Compute the auto-correlation function  of 
          an image 
 
            CP        ;  Save your image first 
            MAS001    ;  File to be saved 
            ACF001    ;  Copy of file to be auto- 
                      ;   correlated 
            AC IC     ;  Perform auto-correlation 
                      ;   in-core 
            ACF001    ;  Input file; to be over- 
                      ;   written by result 
            N         ;  No filtration needed 
 
 
 
 
 
               NOTE:  Filtration may be used, in the  same  way, 
          and  for  the  same  reason,  as in 'CC', to boost the 
          signal/noise ratio of  the  ACF  by  suppressing  high 
          spatial  frequencies.   This is necessary only if both 
          images are very noisy. 
 
         In the same  way,  the  ACF  of  the  second  image  is 
    computed,  resulting in ACF002.  The following example shows 
    the orientation search procedure. 
 
 
          EXAMPLE 14:  Determine orientations by  comparing  two 
          ACFs 
 
            OR X12    ;  Search orientation and put 
                      ;   resulting angle into X12 
            ACF001    ;  First ACF 
            ACF002    ;  Second ACF 
            10,0      ;  10=number of rings to be used 
                      ;   0=use default assignments of  
                      ;   ring radii and weights 
            N         ;  No orientation curve for each 
                      ;   ring to be printed 
 
 
 
         The operation segments  the  two  functions  into  ring 
    zones.   Data  along  pairs  of corresponding ring zones are 
    compared (or correlated) to find the best match.  The result 
    of  the comparison is a so-called orientation curve that can 
    be optionally printed for each ring (option "N" in the above 
    example  would be changed to "Y").  The orientation angle is 
    found by summing all curves and searching for the angle  for
    which the sum curve is maximum.
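
         The ring comparison can be sketched in Python as follows.
    This is an illustration of the principle, not the OR code:
    nearest-neighbour sampling, 72 angular samples per ring  (5
    degree  steps),  unit weights, and the names sample_ring and
    orientation_search are all assumptions.  The test  patterns
    are deliberately not centrosymmetric, so that the answer is
    unique.

```python
import math


def sample_ring(image, radius, n_angles):
    """Sample an image on a circle about its centre (nearest neighbour)."""
    nrow, nsam = len(image), len(image[0])
    cy, cx = nrow // 2, nsam // 2
    samples = []
    for k in range(n_angles):
        angle = 2.0 * math.pi * k / n_angles
        x = int(round(cx + radius * math.cos(angle)))
        y = int(round(cy + radius * math.sin(angle)))
        samples.append(image[y % nrow][x % nsam])
    return samples


def orientation_search(first, second, radii, n_angles=72):
    """Angle (degrees) at which the summed ring correlation is maximum."""
    sum_curve = [0.0] * n_angles
    for radius in radii:
        ring1 = sample_ring(first, radius, n_angles)
        ring2 = sample_ring(second, radius, n_angles)
        # Orientation curve of this ring: circular correlation over the angle.
        for lag in range(n_angles):
            sum_curve[lag] += sum(
                ring1[k] * ring2[(k + lag) % n_angles] for k in range(n_angles)
            )
    best = max(range(n_angles), key=lambda lag: sum_curve[lag])
    return 360.0 * best / n_angles


def pattern(x, y, size=17):
    """Test pattern: one bright pixel at column x, row y."""
    image = [[0.0] * size for _ in range(size)]
    image[y][x] = 1.0
    return image


first = pattern(13, 8)    # bright pixel 5 pixels from the centre (8, 8)
second = pattern(8, 13)   # same pattern turned by a quarter circle
angle = orientation_search(first, second, radii=[3, 5, 7])
print("rotation found:", angle, "degrees")
```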
 
         [The default assignment for ring radii and  weights  is 
    as follows: 
 
                    R(I) = (2/3) x NSAM x I/N
                  I = 1...N; W(I) = 1. everywhere
 
    With this assignment, difference vectors  contained  in  the 
    image  with  the  maximum  length  2/3  x  NSAM  are able to 
    contribute to the orientation alignment.] 
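
         As a worked example of the default assignment, the radii
    can  be tabulated with a few lines of Python.  The function
    name default_rings is hypothetical, and NSAM =  64  with  10
    rings is merely an assumed test case.

```python
def default_rings(nsam, n_rings):
    """Default radii R(I) = (2/3) * NSAM * I / N and weights W(I) = 1."""
    radii = [nsam * (2.0 / 3.0) * i / n_rings for i in range(1, n_rings + 1)]
    weights = [1.0] * n_rings
    return radii, weights


radii, weights = default_rings(64, 10)
for i, (radius, weight) in enumerate(zip(radii, weights), start=1):
    print(f"ring {i:2d}: radius {radius:6.2f}  weight {weight}")
```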
 
                        Alignment Ambiguity 
                        =================== 
 
         Since the ACF is centrosymmetric, an angle phi returned
    from  the  ACF orientation search may mean that the original
    images are rotated by phi or that they are  rotated  by  phi
    + 180 degrees.  A visual check  resolves  this  ambiguity
    immediately.  However, since we are normally dealing with  a
    large number of images, we need an automated check.  This is
    provided by the subsequent  CCF  computation:   Unless  the
    particle  itself possesses centrosymmetry, the CCF should be
    computed for both orientations of the image, phi and  phi  +
    180 degrees.  The angle for which the CCF peak is higher is
    the correct angle to be used for alignment.
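
         The automated check amounts to comparing two CCF peak
    heights.  The Python sketch below shows the decision only.
    It assumes that the image has already been rotated  by  the
    angle phi found from the ACFs, uses a directly computed
    circular CCF, and introduces the hypothetical  names
    ccf_peak_height, rotate_180, resolve_ambiguity and l_shape.

```python
def ccf_peak_height(image, reference):
    """Highest value of the circular cross-correlation over all shifts."""
    nrow, nsam = len(image), len(image[0])
    return max(
        sum(
            image[(y + dy) % nrow][(x + dx) % nsam] * reference[y][x]
            for y in range(nrow)
            for x in range(nsam)
        )
        for dy in range(nrow)
        for dx in range(nsam)
    )


def rotate_180(image):
    """Rotate by 180 degrees: reverse the row order and each row."""
    return [row[::-1] for row in image[::-1]]


def resolve_ambiguity(rotated_image, reference, phi):
    """Return phi or phi + 180, whichever gives the higher CCF peak."""
    up = ccf_peak_height(rotated_image, reference)
    down = ccf_peak_height(rotate_180(rotated_image), reference)
    return phi if up >= down else (phi + 180.0) % 360.0


def l_shape(top, left, size=8):
    """Test particle without centrosymmetry: an L made of four pixels."""
    image = [[0.0] * size for _ in range(size)]
    for dy, dx in ((0, 0), (1, 0), (2, 0), (2, 1)):
        image[top + dy][left + dx] = 1.0
    return image


reference = l_shape(2, 2)
same_way_up = l_shape(3, 4)               # reference, merely translated
upside_down = rotate_180(same_way_up)     # additionally turned by 180
print(resolve_ambiguity(same_way_up, reference, 30.0))
print(resolve_ambiguity(upside_down, reference, 30.0))
```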
 
 
    6.1.3 Padding 
 
         The CCF and ACF computations as shown are correct  only 
    for  images  in  which  the particle diameter (including the 
    peripheral stain well) measures less  than  half  the  image 
    dimension  NSAM/2.   If  the  particle  is larger, the image 
    needs to be padded into a larger field prior to the CC or AC 
    operation.   The operation CP (CoPy) that prepares the input
    file (as in example 13) is replaced by PD (PaD):
 
 
          EXAMPLE  15:   Pad  an  image  as  a  preparation  for 
          correlation 
 
            PD        ;  Pad operation 
            MAS001    ;  Image to be padded 
            PAD001    ;  Output image 
            128,128   ;  Dimensions 
            Y         ;  Yes, we want the average 
                      ;   used as background 
            33,33     ;  Top left coordinates with 
                      ;   respect to output image 
 
 
 
         Note that the dimensions of  the  new  file  should  be 
    powers of two; i.e., the next larger powers of two that will 
    fulfill  the  condition  formulated  above.   The  top  left
    coordinates can be chosen  arbitrarily,  provided  the  same
    choice is used consistently for a given alignment run.
 
         [In the  above  example  #15,  (33,33)  was  used  only 
    because  the output image looks more pleasing aesthetically, 
    as it contains the input image in a central position.] 
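
         The bookkeeping for padding can be sketched in Python.
    This  is  not  the PD operation:  the names padded_dimension,
    centred_top_left and pad_image are hypothetical, coordinates
    are  1-based as in the examples of this text, and the rule
    "smallest power of two larger than twice the particle
    diameter" is my reading of the condition stated above.

```python
def padded_dimension(particle_diameter):
    """Smallest power of two that exceeds twice the particle diameter."""
    size = 1
    while size <= 2 * particle_diameter:
        size *= 2
    return size


def centred_top_left(input_size, output_size):
    """1-based top left coordinate that centres the input in the output."""
    return (output_size - input_size) // 2 + 1


def pad_image(image, out_nsam, out_nrow, left, top):
    """Place image into a larger field filled with the image average."""
    nrow, nsam = len(image), len(image[0])
    average = sum(sum(row) for row in image) / (nrow * nsam)
    padded = [[average] * out_nsam for _ in range(out_nrow)]
    for y in range(nrow):
        for x in range(nsam):
            padded[top - 1 + y][left - 1 + x] = image[y][x]
    return padded


print(padded_dimension(40), centred_top_left(64, 128))
small = [[1.0, 3.0], [5.0, 7.0]]
padded = pad_image(small, 4, 4, 2, 2)
for row in padded:
    print(row)
```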
 
 
  6.2 The SPIDER Alignment Procedures 
 
       All steps of alignment described in the preceding section 
  are  realized  by  SPIDER  alignment procedures.  These differ 
  only in the following aspects: 
           (i) image dimensions 
           (ii) copy or pad 
           (iii) ambiguity checks 
           (iv) use of the document file 
    [A pad into a power-of-two dimensioned  image  allows  input 
    images of arbitrary size to be used.  The ambiguity check is 
    not needed if the particle is centrosymmetric.] 
 
         Before giving an example, the general concept of SPIDER 
    procedures  and  their  use  will  be described.  If you are 
    already familiar with this, skip the following sections  and 
    continue with 6.2.3. 
 
 
    6.2.1 Concept of a SPIDER procedure 
 
         A procedure is a  sequence  of  SPIDER  operations  for 
    which some of the input parameters (e.g.  file names, window 
    dimensions, rotation  angles)  are  specified  at  execution 
    time.   Procedure commands and normal operation commands can 
    be freely mixed. 
 
 
          EXAMPLE 16:  A simple SPIDER procedure:  Pad and  mask 
          a 64x64 file 
 
            ;PM1 1/27/85  :  Pad and mask an image 
            PD                     ; Pad operation 
            ?image to be padded?   ; query soliciting 
                                   ;   the input 
            SCR999                 ; Scratch file 
                                   ;  intermediate 
            128,128                ; Dimensions of 
                                   ;  padded file 
            Y                      ; Yes, use average 
                                   ;  for background 
            33,33                  ; Place into center 
            MA                     ; Mask operation 
            SCR999                 ; Temporary file 
                                   ;  from above 
            ?output file?          ; query soliciting 
                                   ;  the output 
            ?mask radius?          ; query soliciting 
                                   ;  outside mask radius 
            G                      ; Gaussian option 
                                   ;  selected 
            P                      ; Precise average to 
                                   ;  be used 
            65,65                  ; Mask center 
                                   ;  coordinates 
            RE                     ; Return 
 
 
 
         To make this procedure available, this command sequence
    has  to  be  typed  into  a  file  PM1.PRJ, where PRJ is the
    three-letter extension introduced earlier.  The general form
    of a procedure name is ABN, where A,B are alphabetic
    characters and N is a digit.
 
         Recently (May, 1985), more general procedure names have
    been made available.  They can take the form of

                               ABCDE

    where ABCDE is a string of up to 8 letters or  digits,  e.g.
    ALIGN10  or  ORIENT64.   The  corresponding  files  would be
    ALIGN10.PRJ and ORIENT64.PRJ, respectively.  These procedures
    are invoked by @ABCDE, e.g.  @ALIGN10.
 
                          Interactive Use 
                          =============== 
 
         To use the  procedure,  type  PM1.   The  queries  will 
    appear  in  the  same order as in the procedure listing, and 
    will wait for your answer: 
 
 
          EXAMPLE 17:  Use procedure interactively 
 
            PM1 
            ?image to be padded? WIN005 
            ?output file? PAD005 
            ?mask radius? 15. 
 
 
 
         When the procedure run is complete, SPIDER returns with
    'OPERATION:'.   To see the actual processing sequence
    performed, you can list the  results  file  RESULTS.PRJ.
 
                             Batch Use 
                             ========= 
 
         The procedure may be called from a batch command  file. 
    As  an  example,  we  show  the  call  to the procedure in a 
    DO-Loop: 
 
 
          EXAMPLE 18:  Call procedure from batch 
 
            ;B03  1/27/85 
            DO LB1 I=1,75  ; DO-Loop over all 
                           ;  particles 
            PM1            ; Procedure call 
            WIN00I         ; Input file 
            PAD00I         ; Output file 
            (15.)          ; Mask radius 
            LB1            ; DO-Loop label 
            EN             ; 'ENd' command 
 
 
    6.2.2 Standard SPIDER Procedures:  Use and Documentation 
 
          Standard procedures available  in  the  system  library
     have  the extension '.SYS'.  They are called by default if a
     procedure is invoked and no file with  that  name  and  the
     session's project extension is found in the user's own
     directory.

          Example:  User calls AL8 and has prepared  the  command
     file AL8.FIC.  If he is running SPIDER with project extension
     FIC, he will be using his own procedure AL8.FIC.  If he runs
     SPIDER with a different project extension, or if he has  not
     written his own procedure file, the system will default to
     AL8.SYS in the system's directory.
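
          The lookup rule can be summarized in a short Python
     sketch.   It  only  mimics  the rule as described here; the
     function name resolve_procedure, the directory arguments and
     the second project extension HEM are hypothetical.

```python
import os
import tempfile


def resolve_procedure(name, project_ext, user_dir, system_dir):
    """Return the user's own NAME.EXT if it exists, else NAME.SYS."""
    own_file = os.path.join(user_dir, f"{name}.{project_ext}")
    if os.path.isfile(own_file):
        return own_file
    return os.path.join(system_dir, f"{name}.SYS")


with tempfile.TemporaryDirectory() as user_dir, \
        tempfile.TemporaryDirectory() as system_dir:
    # The user has prepared AL8.FIC only.
    open(os.path.join(user_dir, "AL8.FIC"), "w").close()
    for ext in ("FIC", "HEM"):
        chosen = resolve_procedure("AL8", ext, user_dir, system_dir)
        print(ext, "->", os.path.basename(chosen))
```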
 
         The standard procedures are  listed  in  the  procedure 
    documentation:   For  each  procedure,  a command listing is 
    given and the interactive use  is  described  in  a  similar 
    manner  as  the  use  of  SPIDER  operations.   [The command 
    'SPHELP'  (outside  of  SPIDER)  allows   access   to   this 
    documentation.] 
 
 
    6.2.3 Use of Alignment Procedures 
 
         Alignment procedures come in two steps: 
 
         (i) The reference image and files derived from this are 
    prepared by an initialization at the beginning of the batch; 
    (ii) the actual alignment of an image set  with  respect  to 
     the  reference  image is done in a DO-loop.  In our example,
     IL5 is used for initialization, AL8 for the alignment.
 
 
          EXAMPLE 19:  Use of alignment procedure 
 
            IL5            ;  Initialize 
            MAS007         ;  Reference image 
            DO LB1 I=1,80  ;  DO-loop over 80 particles 
            AL8            ;  Alignment procedure 
            MAS00I         ;  Image to be aligned 
            X0             ;  DO-loop count 
            OUT00I         ;  Aligned file 
            DOC001         ;  Document file 
            LB1            ;  DO-loop label 
            EN             ;  'End' command 
 
 
 
          The alignment procedures used above, IL5 and  AL8,  are
    for  the  general  case of particles without centrosymmetry; 
    i.e.  they contain a 180-degree check  (see  6.1.2).   Since 
    this   check   is  relatively  time-consuming,  the  simpler 
    procedures ...   and  ...   should  be  used  for  particles 
    possessing  centrosymmetry.   IL5.SYS and AL8.SYS are listed 
    in the following. 
 
      
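    ; ### IL5 7/5/78 ### ALIGNMENT INITIALIZATION FOR DL5
    PD                  ;  ### PAD INPUT INTO REF ###
    ?REFERENCE IMAGE?
    REF999              ;  Padded reference
    128,128             ;  Dimensions of padded file
    Y                   ;  Yes, use average for background
    33,33               ;  Top left coordinates
    RT                  ;  Rotate reference by 180 degrees
    P1                  ;  First file entered above
    SCR999              ;  Scratch file for rotated reference
    (180.)
    PD                  ;  Pad rotated reference
    SCR999
    REF888              ;  Padded reference, 'down' position
    (128,128)
    Y
    (33,33)
    DE                  ;  Delete scratch file
    SCR999
    CP                  ;  ### COPY REF INTO ACF ###
    REF999
    ACF999
    AC                  ;  ### AUTO-CORRELATE ACF ###
    ACF999
    N                   ;  No filtration
    RE                  ;  Return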
 
 
    ; AL8.SYS : ACF ALIGN W/ 180 DECISION.  TO PRECEDE SL1.SYS  
    ;   DIRECT CYCLE USE IL5.SYS FOR INITIALIZATION 
    PD                  ; Pad image into 128x128 field 
    ?IMAGE TO BE ALIGNED?       ; Image to be padded 
    TMP002                      ; File containing large image 
    (128,128)           ; Dimensions of large image 
    Y                   ; Yes--use average of input image for padding 
    (25,25)                     ; Top left coordinates of image 
    RR X50                      ; Read register 
    ?PARTICLE NUMBER? ; Particle number to be stored in register--used as key in document file 
    AC  IC                      ; Compute ACF in-core 
    TMP002                      ; File to be overwritten by its ACF 
    N                   ; No filtration requested 
    OR X10,X90          ; Search orientation, put angle into X10 
    ACF999                      ; ACF of reference particle 
    TMP002                      ; ACF of current particle 
    (10,0)                      ; Use 10 rings, 0 = use default radii and weights 
    N                   ; No orientation curve to be plotted for each ring 
    RT                  ; Rotate 
    P1                  ; Image to be rotated = first image entered above 
    COP899                      ; Rotated image 
    -X10                        ; Use negative angle from Orient above 
    PD                  ; Pad rotated image 
    COP899                      ; Rotated image 
    PAD999                      ; File containing large image 
    (128,128)           ; Dimensions of large image 
    Y                   ; Yes--use average of input image for padding
    (25,25)                     ; Top left coordinates of image 
    CP  ; SAVE PADDED FILE FOR SECOND CC 
    PAD999 
    PAD888 
    CC IC                       ; Cross-correlate in core, TRY 'UP' POSITION
    PAD999
    REF999                      ; Padded reference 
    N                   ; No filtration requested 
    PK X11,X12,X13,X14,X15,X16 ; Peak search CCF in 'up' position 
    PAD999 
    (3,0)               ; 3 peaks to be searched, 0=no default origin override 
    CC IC               ; Cross-correlate in-core, NOW TRY 'DOWN' POSITION 
    PAD888 
    REF888 
    N 
    PK X21,X22,X23,X24,X25,X26  ; Peak-search CCF in 'down' position 
    PAD888 
    (3,0) 
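    X89=1                       ; Flag: 1 = 'up' position assumed
    IF(X23.LT.X13)GOTO LB1      ; 'Down' peak lower than 'up' peak: keep 'up'
    X15=-X25                    ; Otherwise use 'down' peak position,
    X16=-X26                    ;  with inverted sign
    X10=180-X10                 ; New rotation = previous rotation (-X10) + 180
    RT                          ; Rotate original image again
    P1
    COP899                      ; Overwrites previously rotated image
    X10
    X89=2                       ; Flag: 2 = 'down' position chosen
    LB1
    SH F                        ; Floating-point shift of rotated image
    COP899
    ?OUTPUT IMAGE?
    -X15,-X16                   ; Negative peak position
    SD X50,X10,X15,X16,X25,X26,X89 ; Store key X50 and registers in document file
    ?DOCUMENT FILE?
    RE                          ; Return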