Planet Four: Probing Springtime Winds on Mars by Mapping the Southern Polar CO$_2$ Jet Deposits

The springtime sublimation process of Mars' southern seasonal polar CO$_2$ ice cap features dark fan-shaped deposits appearing on the top of the thawing ice sheet. The fan material likely originates from the surface below the ice sheet, brought up via CO$_2$ jets breaking through the seasonal ice cap. Once the dust and dirt is released into the atmosphere, the material may be blown by the surface winds into the dark streaks visible from orbit. The location, size and direction of these fans record a number of parameters important to quantifying seasonal winds and sublimation activity, the most important agent of geological change extant on Mars. We present results of a systematic mapping of these south polar seasonal fans with the Planet Four online citizen science project. Planet Four enlists the general public to map the shapes, directions, and sizes of the seasonal fans visible in orbital images. Over 80,000 volunteers have contributed to the Planet Four project, reviewing 221 images, from Mars Reconnaissance Orbiter's HiRISE (High Resolution Imaging Science Experiment) camera, taken in southern spring during Mars Years 29 and 30. We provide an overview of Planet Four and detail the processes of combining multiple volunteer assessments together to generate a high fidelity catalog of $\sim$ 400,000 south polar seasonal fans. We present the results from analyzing the wind directions at several locations monitored by HiRISE over two Mars years, providing new insights into polar surface winds.


Introduction
Mars has a predominantly CO$_2$ atmosphere with pressure levels buffered by seasonal CO$_2$ polar caps [Leighton and Murray, 1966]. In the winter atmospheric CO$_2$ falls as snow or condenses directly onto the surface, forming a seasonal ice layer with a thickness of up to 1 m, depending on the latitude. In the spring the south polar region of Mars exhibits a host of exotic phenomena associated with sublimation of the seasonal CO$_2$ polar cap, and sublimation winds [Smith et al., 2001] contribute to atmospheric circulation.
In the south polar region images from the Mars Reconnaissance Orbiter (MRO) High Resolution Imaging Science Experiment (HiRISE, McEwen et al. [2007]) document activity best described by the "Kieffer" model [Kieffer, 2007; Piqueux et al., 2003a]:
1. Over the winter CO$_2$ anneals to form a translucent slab of impermeable ice. Penetration of sunlight through the CO$_2$ ice, which warms the ground below, results in basal sublimation of the ice.
2. The laboratory measurements done by Hansen [2005] show that up to 70 % of the solar energy that reaches the top surface of a 1 m thick slab layer can be transmitted through it.
Recent laboratory experiments by Kaufmann and Hagermann [2016] were able to trigger dust eruptions from a layer of dust inside a CO$_2$ ice slab under Martian conditions, lending further credence to the proposed CO$_2$ jet and fan production model.
3. Trapped gas escapes through ruptures in the ice, eroding and entraining material from the surface below [de Villiers et al., 2012].
4. When this dust-laden gas is expelled into the atmosphere the dust settles in fan-shaped deposits on the top of the ice in directions oriented by the ambient wind, as shown in Figure 1 [Thomas et al., 2010, 2011].
5. When the layer of seasonal ice sublimates in summer, the fans fade, as the material mostly blends back into the surface.
6. The compressed CO$_2$ gas streams of the jets are believed to erode the surface, carving uniquely Martian spidery channels originally identified in images from the Mars Orbiter Camera [Piqueux et al., 2003b], now referred to as araneiforms.
The number, time history, area covered and changes in direction of the fans provide a wealth of information on the spring sublimation process and spring winds. Apart from a few wind direction estimates from remotely observed dunes [Ewing et al., 2010] and surface rover wind measurements [Greeley et al., 2006; Newman et al., 2017], no widespread wind measurements exist for Mars. The science goals enabled by cataloging fan measurements fall into two categories:
1. Enhance our understanding of spring winds and provide constraints for global and mesoscale circulation models. The length, width, and direction of these fans are snapshots in time of the local wind direction. Changes in the orientation of the fans over time record changes in wind direction. These markers can be compared to predictions from global and mesoscale circulation models (e.g. Smith et al. [2015]) to improve our understanding of Mars' weather in the polar regions. Dust injected into the atmosphere can be estimated.
2. Extend our understanding of the sublimation process and its efficacy as an agent of change on the Martian surface. The number of fans as a function of time records sublimation activity while the overlying ice thickness and insolation change during the season. The areal coverage of the fans allows us (with reasonable assumptions about particle size) to estimate the amount of material eroded from the surface on seasonal timescales. Inter-annual variability and the relationship of the timing of seasonal activity to global dust storms can be quantified with this data-set (these are topics of future papers).
Although the value of this data-set is clear, the sheer number of fans (on the order of hundreds of thousands) present in HiRISE images from multiple locations and times observed over many Mars years makes it a daunting data-set to catalog. Attempts at developing automated detection algorithms have been unsuccessful at reliably identifying the locations and shapes of these seasonal fans in images from orbit [Aye et al., 2010]. However, there is increasing interest in using the outcomes of citizen science projects as training data for neural networks (e.g. Alger et al. [2018]; Banerji et al. [2010]; Bird et al. [2018]; Bowley et al. [2018]; Nguyen et al. [2018]; Peng et al. [2018]), hence we believe that these two lines of research will become strongly complementary in the near future.
The task of mapping the dark fans is simply pattern recognition, and the human brain is ideally suited for this task, easily capable of spotting and outlining these features. With the advent of the Internet, tens of thousands of people across the globe can be enlisted to assist scientists with tasks that are impossible to automate. This citizen science or crowd-sourcing approach, where independent assessments from multiple non-expert classifiers are combined, has become an established technique as data volumes have continued to grow. This method has been applied to nearly all areas in astronomy and planetary science [Marshall et al., 2014] (see references therein) including galaxy morphology [Lintott et al., 2008; Willett et al., 2013], identification of planet transits [Schwamb et al., 2012], crater counting [Bugiolacchi et al., 2016; Robbins et al., 2014] and a sister project of the efforts presented here, Planet Four: Terrains [Schwamb et al., 2017b]. In collaboration with the Zooniverse 1 [Fortson et al., 2012; Lintott et al., 2011], the largest collection of online citizen science projects, we have developed Planet Four 2 , a web portal to enlist the general public to identify and map the seasonal fans in HiRISE images of Mars' polar regions.
In this paper we present the first results from the Planet Four project, a catalog of seasonal fans from two Mars years, MY 29 and 30, of HiRISE monitoring of the Martian South Polar region. In Section 2, we provide an overview of the HiRISE South Pole Seasonal Processes Monitoring Campaign and the specific HiRISE observations used in this study. In Section 3, we present the Planet Four project and the online classification interface. Section 4 details the process for assessing and combining the volunteer classifications to create a catalog of seasonal features. In Section 5 we examine our catalog's validity by comparing results between volunteers and science team members. Section 6 presents general statistical results of the catalog, and finally, we use the catalog for an initial probing into regional winds in Section 7. We summarize our conclusions in Section 8. All place names referred to in this paper are informal and not approved by the International Astronomical Union. Full machine-readable versions of the catalogs and tables presented in this paper are also available from https://www.planetfour.org/results.

HiRISE Instrument and Seasonal Processes Monitoring Campaign
The Mars Reconnaissance Orbiter (MRO) has the ability to turn off nadir to target a specific location. In its inclined orbit there are numerous opportunities to achieve repeat coverage in the polar region. In order to study seasonal processes the HiRISE team selected a limited number of regions of interest (ROIs) in the Martian south polar region to image throughout the spring season. Time is defined on Mars by the orbital longitude $L_s$, where southern spring begins at $L_s$ = 180°.
Originally, the HiRISE monitoring campaigns were numbered by the ordinal number of the season in which the MRO mission had been observing Mars. This work focuses on the observations from seasons 2 and 3, which have more regular repeat HiRISE imaging of ROIs over multiple years than the season 1 HiRISE monitoring campaign. To be able to compare with other missions and modeling, we also identify our data using the convention of Martian years, established by Clancy et al. [2000] and Piqueux et al. [2015], where Mars Years 29 and 30, also written as MY29 and MY30, correspond to HiRISE seasons 2 and 3. Every day, citizen scientists are making more fan measurements for later Mars years and the catalog continues to grow. The longer timespan covered by the catalog will be discussed in future paper(s).
Figures 4 and 5 provide an overview of the observed locations and times in solar longitudes of the HiRISE data used in this work. Table 1 lists the ROIs selected for analysis using Planet Four. 221 high quality images from southern spring season 2 and 3 (i.e. MY 29 and 30) were selected for analysis on Planet Four (see Table 2). The reduced HiRISE products were obtained from the National Aeronautics and Space Administration's (NASA) Planetary Data System (PDS) HiRISE PDS Data Node 3 .
HiRISE is a pushbroom imager. It has ten 2048-pixel detectors in the cross-track direction, which cover ∼6 km at the spacecraft altitude of 300 km (MRO is in an elliptical 255 km by 320 km orbit). An image is built up in the along-track dimension as the spacecraft travels in its orbit, with a ground velocity of ∼3 km s$^{-1}$. A typically sized image has ∼60,000 pixels along-track and thus covers a (6 × 18) km$^2$ area. Color is available in the center 20 % of the image. A full description of the camera is found in McEwen et al. [2007].
It is generally easier to identify the fans in the color portion of the image, so only the ∼1 km wide color (RGB) sub-image was used for the Planet Four image set. A visitor to the Planet Four website is presented with a sub-image from an RGB non-map-projected HiRISE image. Each HiRISE frame (typically several hundred megabytes in size) is divided into 840 × 648 pixel subimages that we will refer to as "tiles". To avoid edge effects, the tiles are generated such that there is a 100-pixel overlap with the neighboring tiles. We avoid showing volunteers tiles where part or most of the tile is blank. Due to the variable length and width of HiRISE images, there is typically a small region on the right and bottom edges of the non-map-projected HiRISE image that cannot be made into a full-sized tile and thus is not searched for seasonal features with Planet Four. Pixel sampling scales per tile are typically 24.7 cm/pixel when HiRISE is in 1 × 1 binning mode, and the seasons 2 and 3 observations span binning resolutions of 1 × 1 to 4 × 4. For the seasons 2 and 3 monitoring campaign, a HiRISE image is associated with 36 to 635 tiles (see Table 2).
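The tiling scheme described above can be illustrated with a minimal sketch. The function name `tile_origins` and the choice to start tiling at the top-left corner are assumptions made for illustration; only the tile size (840 × 648 pixels), the 100-pixel overlap, and the fact that partial tiles on the right and bottom edges are skipped come from the text.

```python
def tile_origins(width, height, tile_w=840, tile_h=648, overlap=100):
    """Return top-left (x, y) pixel origins of all full-sized tiles.

    Hypothetical helper: neighboring tiles share `overlap` pixels, and any
    remainder on the right or bottom edge that cannot hold a full tile is
    left uncovered, as described in the text.
    """
    step_x = tile_w - overlap  # horizontal stride between tile origins
    step_y = tile_h - overlap  # vertical stride between tile origins
    xs = range(0, width - tile_w + 1, step_x)
    ys = range(0, height - tile_h + 1, step_y)
    # Row-major order: left to right, then top to bottom.
    return [(x, y) for y in ys for x in xs]


# Example: a 2000 x 1500 pixel image yields a 2 x 2 grid of full tiles.
origins = tile_origins(2000, 1500)
```

In this example the rightmost tile ends at pixel column 1580, so a 420-pixel strip on the right edge is not covered by any tile.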

Planet Four
Here we describe the Planet Four classification interface and the information generated by volunteers visiting the Planet Four website.

Classification Web Interface
Planet Four volunteers are asked to identify and outline fans in the presented tiles. Sometimes a deposit has an indeterminate direction, in which case we call it a "blotch". Although less useful for wind regime studies, the blotches are sites where the ice has ruptured and released material, so they are important for studying the sublimation process of the polar CO$_2$ ice sheet. Thus, volunteers are asked to identify and mark blotches as well. Positions, orientations, and sizes of fans and blotches are obtained via a web interface (see Figure 6) built upon the Zooniverse's Application Programming Interface (API), which communicates with their custom built Ouroboros web platform (described in Appendix A). Each tile is assessed by approximately 30-100 independent reviewers. To ensure reviewers have no prior information that may influence their judgment, tiles are randomly served to the classifier, and no identifying information about the parent HiRISE image is presented in the Planet Four web interface. The volunteer is blind to the location on the South Pole, the time of season the observation was taken, and the responses from other classifiers while reviewing a given tile. Planet Four was launched originally in English; later on the website, classification interface, and help material were also translated into several languages, including traditional and simplified character Chinese, German, and Magyar (Hungarian). For the analyses presented here, all Planet Four classifications are treated the same, regardless of what language the volunteer was using in the classification web interface.

Tutorial
First time visitors to the Planet Four website are presented with a short inline interactive tutorial that explains the task and guides the classifier on how to use the marking tools. Additional training material is also available elsewhere on the site. The tutorial is shown only once to classifiers using the Planet Four web interface while logged in with a registered Zooniverse account. Volunteers using the site in the non-logged-in mode are presented with the tutorial each time they visit the Planet Four website. Other than the frequency of the tutorial appearing, the user experience on Planet Four, including the tutorial content, is exactly the same for logged-in and non-logged-in volunteers.

Marking Tools
Fans and blotches are drawn by selecting the appropriate tool in the classification interface (see Figure 6), clicking on the tile displayed, and dragging to resize the marker to the appropriate shape and orientation. The fan tool generates a triangle with a rounded base with the user controlling the endpoint of the fan. The default opening angle for the fan marker is set to 5°. The blotch tool simply produces an ellipse with the user controlling the size and orientation of the major axis. For blotches, the default length of the minor axis is 0.75 times the pixel length of the major axis drawn. Once a blotch or fan marking has been made, a classifier can edit the initial parameters by manipulating handles on the marker. For blotches, the length of the major and minor axes and rotation can be adjusted. For fans, the opening angle, orientation, and length can be modified. If only a single mouse click is made on the interface, then the minimum sized fan or blotch marker is produced: a fan with a length of 10 pixels and an opening angle of 1° or an ellipse with both axes equal to 10 pixels. Additionally, there is an 'Interesting Feature' tool available for volunteers to highlight the position of anything that they deem worth review by the Planet Four Science Team. The Interesting Feature marker is not resizable. All markers drawn in the web interface can be repositioned or removed by the classifier.
Figure 6: The fan (above) and blotch (below) marker on the Planet Four tutorial image. Black circles and diamonds are the marker handles that can be used to adjust the shape and orientation in the web classification interface. The "x" is used to delete the marker.

Classification Database
Once the volunteer is done making markings, if any, and hits the 'Finished' button, the classification (which we define as the sum total of all the markings or lack of markings made by the volunteer) is submitted to the Ouroboros API to be saved to a database. At this point, the classifier can move on to view the next tile by hitting the 'Next' button or can choose instead to enter the Planet Four discussion tool (discussed in further detail in Section 3.3). Once the classification has been submitted, it cannot be revised. For blotches, the center position, rotation angle, and pixel lengths of the major and minor axes of the ellipse are recorded. For fans, the starting position, distance in pixels from the starting point to the end of the fan, opening angle, and rotation angle are saved to the database. For interesting features, only the pixel location is stored. If no features are marked, the database records the classification as a non-marking. A tile identifier and timestamp for each classification is also stored in the database.
If the volunteer is logged in with a registered Zooniverse account, the classifications are tracked in the database via the associated username. For non-logged-in classifications, a unique session id is generated and used to link the classifications completed by a given IP address and web browser. The non-logged-in identifier does not exactly correspond one-to-one to a unique individual. If a person classifies non-logged-in and changes their IP address, their new classifications are stored under a different identifier. Additionally, if a volunteer initially participates as a non-logged-in classifier on Planet Four and then registers for a Zooniverse account, the previous classifications stored in the database are not linked to the Zooniverse username and remain associated with the unique non-logged-in session identifier.
We note there are occasional spurious or duplicate entries stored in the classification database, typically due to a glitch in the classifiers' browser or a minor bug in the Ouroboros framework. These entries compose a very small percentage of the total volunteer classifications. They are easily identified and removed from the analysis presented here. Further details are provided in Appendix B. Additionally, the Planet Four classification interface originally recorded a different angle than the intended spread angle from the fan marking tool. This was identified and subsequently fixed in the software. The true spread angle of the fan marker drawn by the volunteers is recoverable from the values recorded in the database, and we have adjusted the classifications affected.

Talk Discussion Tool
Associated with the Planet Four classification interface is a dedicated object-orientated discussion tool known as "Talk" 4 . Each Planet Four tile assessed on the main classification interface has a dedicated page on the Planet Four Talk website. Volunteers can access these pages directly through the classification interface after submitting their classification. With Talk, volunteers can write comments, add searchable Twitter-like hash tags, create longer side discussions, and group similar tiles together in collections. For the analysis presented here, we focus strictly on the volunteer markings from the main user interface, and do not include a complete analysis of the data from the Talk tool.

Site History
Planet Four was publicly launched on 2013 Jan 8 as part of the British Broadcasting Corporation's (BBC) Stargazing Live, three nights of live astronomy programming (2013 Jan 8-10) on BBC Two in the United Kingdom. Review of Season 2 and 3 tiles spanned January 2013 to March 2015, with 9,809,637 classifications produced in total. The majority of classifications for Seasons 2 and 3 were obtained during the BBC Stargazing period, but subsequently data from HiRISE's other seasonal monitoring campaigns were mixed with the Season 2 and Season 3 classifications. The results from data outside seasons 2 and 3, which are still in the process of being reviewed on the Planet Four website, will be the topic of subsequent publications. Figure 7 plots the distribution of classifications per tile for Seasons 2 and 3. Due to the high classification rate at launch, tiles were set to retire from rotation in the web interface after 100 independent assessments (counting duplicates) to ensure that the project would continue to serve data over the Stargazing period. Over time the classification rate dropped significantly from launch, and on 2013 Dec 9 the retirement threshold for a tile was lowered to a more reasonable (and statistically acceptable) value of 30 to better accommodate the actual work rate on Planet Four. This value is similar to the image retirement threshold that was used by the Zooniverse's Milky Way Project, which enlists the general public in a similar task, drawing circles on space-based infrared images to identify the shape and size of star formation bubbles.

User Statistics
36,433 registered volunteers and 48,094 non-logged-in sessions have classified at least one tile in our MY29/30 data-set. Volunteers made in total 9,461,062 classifications with a median of 7 and average of 41 classifications per registered volunteer/non-logged-in session. The highest number of different classifications (i.e. submitted Planet Four tiles) by the same volunteer was 31,808. After clean-up, Planet Four volunteers drew a combined 3,460,056 blotches, 2,694,415 fans, and 805,903 interesting features. Figure 8 shows the distribution of volunteer classifications for Seasons 2 and 3 tiles combined. Individual registered volunteers (median of 14 and average of 69 classifications per user) tend to contribute slightly more classifications than an individual non-logged-in session (median of 4 and average of 21 classifications per session). A given volunteer/session reviews only a small percentage of the entire sample of HiRISE tiles. Only 15 % of classifiers (12,483 registered volunteers and non-logged-in sessions) have contributed more than 50 classifications. Most volunteers contribute a few classifications of Planet Four tiles before leaving the site. This is a typical response for web-based projects [Crowston and Fagnot, 2008; Zachte, 2012] and is similar to the volunteer behavior found on other Zooniverse projects [Sauermann and Franzoni, 2015].

Data reduction
In order to create fan and blotch object catalogs from the Planet Four markings, a reduction pipeline was implemented, for which the code is open source and publicly available 5 . The pipeline is written in the Python programming language, interfaces with the US Geological Survey's (USGS) Integrated Software for Imagers and Spectrometers (ISIS) [Anderson et al., 2004; Becker et al., 2007], and makes use of the "scikit-learn" package for machine-learning related tasks [Pedregosa et al., 2011]. This data reduction pipeline has five main conceptual stages (see Fig. 9): Cleanup, where the Planet Four classification data are cleaned, normalized and converted to a binary database (Section 4.1); Clustering, where the markings of the many different volunteers are combined into, ideally, one resulting average object (Section 4.2); Combination, where we combine fan and blotch markings that seem to address the same visible object in the image into a meta-object for further processing during the next stage (Section 4.3); Thresholding, where a cut on the required number of volunteers that voted for either fan or blotch decides whether the previously created meta-object should be considered a fan or a blotch (Section 4.3.1); and finally Ground Projection, where we project the HiRISE image pixel coordinates of the resulting fan and blotch markings into latitude and longitude coordinates on Mars (Section 4.4).
5 The pipeline is located at https://github.com/michaelaye/planet4.
Figure 9: Overview of conceptual steps of the Planet Four data reduction pipeline.

Database Cleanup
After the removal of the tutorial data (see Section 3.1.1), and a first cleaning for spurious, incomplete and duplicate classification database entries (see Appendix B), we normalize all angles from the Planet Four classification interface, and finally produce a binary database in the HDF5 format (Hierarchical Data Format, version 5) for the remainder of the data processing. Normalizing the angles is required because the Planet Four system records blotches with an angular range from -180 to 180 degrees, while ellipses possess a degree-2 rotational symmetry. This means only the range of 0 to 180 degrees is required to fully describe blotches, once the radii are sorted in a consistent way (semi-major axis first). Volunteers randomly start to draw the ellipses required to mark blotches either from the semi-minor axis or the semi-major axis, making it error-prone to cluster on these parameters without normalization. The cleaned raw Planet Four classifications as used by this work's analyses are provided as supplemental data to this work in the file P4_catalog_v1.0_raw_classifications.csv. Further details about the format of the raw classifications are described in Appendix C.
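As an illustration of the blotch angle normalization, the following is a minimal sketch under stated assumptions. The function name `normalize_blotch`, the argument names, and the convention that the recorded angle refers to the first radius are hypothetical; the text only states that radii are sorted semi-major axis first and that angles are reduced to the 0 to 180 degree range.

```python
def normalize_blotch(angle_deg, radius_1, radius_2):
    """Return (angle, semi_major, semi_minor) with the angle in [0, 180).

    Assumption for illustration: `angle_deg` is the orientation of the axis
    with length `radius_1`. If that axis is the shorter one, the radii are
    swapped and the angle is rotated by 90 degrees so that it refers to the
    semi-major axis.
    """
    if radius_1 < radius_2:
        radius_1, radius_2 = radius_2, radius_1
        angle_deg += 90.0
    # An ellipse maps onto itself after a 180 degree rotation, so angles
    # that differ by 180 degrees describe the same blotch.
    return angle_deg % 180.0, radius_1, radius_2


# Two volunteers drawing the same ellipse from different axes end up with
# identical normalized parameters.
a = normalize_blotch(-150.0, 20.0, 10.0)
b = normalize_blotch(-60.0, 10.0, 20.0)
```

After normalization both markings describe the same ellipse with the same three numbers, which is what makes clustering on these parameters meaningful.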

Clustering
We identify fans and blotches by combining the multiple volunteer assessments from each Planet Four tile. To identify and precisely locate the marked features from the multiple classifications performed by many (between 30 and 100, see Appendix A) volunteers per Planet Four tile, we perform a clustering analysis on the data. Figure 10 shows an example of fan markings for a Planet Four tile. After having evaluated several different clustering algorithms, we have identified the Density-based Spatial Clustering of Applications with Noise (DBSCAN) clustering algorithm of Ester et al. [1996] as the most appropriate one for our application. DBSCAN has the advantage of not requiring the number of expected clusters as input; instead, it is controlled by two input parameters describing the minimum number of members of a cluster (min_samples) and the maximum distance for a data point to be included in a cluster (epsilon). (Details on how we determine these parameters are described in Section 4.2.1.) We set up our clustering pipeline using the DBSCAN implementation in the scikit-learn Python library [Pedregosa et al., 2011]. All volunteer responses are treated with equal weight in the clustering algorithm. Due to the differences in the classification interface for marking fans and ellipse-shaped blotches (fans are drawn from a base point, whereas blotches are drawn from the center), the fan and blotch markings are clustered separately at this stage, and require their own set of clustering parameters.
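To make the role of the two DBSCAN parameters concrete, here is a minimal sketch using the scikit-learn `DBSCAN` class on synthetic base-point coordinates. The function name `cluster_base_points`, the toy data, and the plain arithmetic mean of the member coordinates are illustrative assumptions and are not taken from the Planet Four pipeline code.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_base_points(xy, eps=10.0, min_samples=4):
    """Cluster (x, y) pixel coordinates and average each cluster.

    Returns the per-marking labels (-1 marks noise) and a dict mapping
    each cluster label to the mean (x, y) position of its members.
    """
    xy = np.asarray(xy, dtype=float)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit(xy).labels_
    centers = {}
    for label in set(labels.tolist()):
        if label == -1:
            continue  # markings that joined no cluster are discarded as noise
        centers[label] = xy[labels == label].mean(axis=0)
    return labels, centers


# Synthetic markings: two features marked by four volunteers each,
# plus a single isolated marking that should be flagged as noise.
markings = [
    (100, 100), (102, 101), (99, 98), (101, 103),
    (400, 300), (403, 299), (398, 302), (401, 301),
    (700, 50),
]
labels, centers = cluster_base_points(markings, eps=10.0, min_samples=4)
```

Note that scikit-learn counts a point itself toward `min_samples`, so four mutually close markings are sufficient to form a cluster with `min_samples=4`.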
In a first stage, we cluster the data for each Planet Four tile on the (x, y) pixel coordinates of the base point of fans and of the center of blotches (see Fig. 12 for a visual description of the available coordinates of the markings). Figure 13 shows the result of clustering in two dimensions on the x and y base coordinates of the fan markings, using the multi-step approach shown in Fig. 11 and described below. Once the clusters for a given set of parameters (see Section 4.2.1 for details on the parameter tuning) have been defined, the original marking data of each cluster's members are averaged to create one average marking object per cluster, including average directions for fan objects (e.g. in Fig. 13). The number of markings that went into the creation of the averaged object is stored for later.
After having clustered both fans and blotches on their base and center coordinates respectively, we apply a second stage of clustering on the markings. For fan deposits, the major objective of this work is to determine the wind direction they indicate. Due to this we want to be able to distinguish between different wind directions from the same source point, i.e. multiple subsequent eruptions, where later eruptions occurred with a different prevalent wind direction. In the Planet Four help content we have emphasized that the volunteers should outline several fans if they appear to start from the same source point.
[Figure caption, beginning missing] With such a large number of different volunteers classifying, the "sensitivity" for detection is increased, as notable by a few markings that outline even the smallest potential dark deposit candidates. However, when the "crowd" does not agree with these, i.e. if the potential cluster does not reach the min_samples number of required members, the clustering pipeline discards these entries, as shown in Fig. 13.
Figure 11 (panels: Fan clustering, Blotch clustering): The sequence of clustering steps for both fan and blotch markings. It became apparent during our studies that fan markings show less scatter, probably due to the tool having to be placed at a clearly identifiable base point. Blotches, however, do not show a clearly identifiable center, and their outline is often less sharply defined, creating a wider distribution of marking results, especially for larger blotches. This required a second run of clustering with more relaxed cluster parameters, as described in Section 4.2.1 and in Table 3.
Figure 13: Left: For direct comparison, this shows the same as Fig. 10 on the right. Right: Results after clustering, identification of noise markings, and averaging the cluster members' data into one object per cluster. Markings that do not become a member of a cluster are defined as noise and will be discarded from further processing (shown as white dots).
This is very relevant for data like that in Fig. 14, to identify several wind directions indicated by the fans, from multiple subsequent jet eruptions. By clustering not only on the base coordinates (x, y) but also on the recorded alignment angle of the fan markings, we are able to distinguish these subsequent fan deposits with different wind directions. We have determined by reviewing the clustering results of a subset of the data that 20 degrees as a clustering value for angles enables this objective. It means that fan markings whose alignment angles differ by more than 20 degrees are clustered into their own subcluster, even if they start at the same base point. Blotches, on the other hand, are used for deposits that do not clearly indicate a direction, which is why we do not apply an angle clustering here. However, blotches do not show a clearly identifiable center, and their outline is often less sharply defined, creating a wider distribution of marking results, especially for larger blotches. Thus, we also cluster on the resulting ellipse radii for the blotches to ensure that we identify the statistically most common shape of the volunteers' blotch markings.
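The angle sub-clustering for fans that share a base point could be sketched as follows. This is an illustrative assumption rather than the pipeline's actual code: the function name `cluster_fan_angles` is hypothetical, and handling the 0/360 degree wrap-around through a precomputed circular distance matrix is a design choice made here, not something stated in the text. Only the 20 degree epsilon comes from the paper.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_fan_angles(angles_deg, eps_deg=20.0, min_samples=3):
    """Group fan alignment angles (degrees) into direction subclusters.

    A circular distance is used so that, e.g., 358 and 5 degrees are
    treated as 7 degrees apart rather than 353 degrees apart.
    """
    a = np.asarray(angles_deg, dtype=float) % 360.0
    diff = np.abs(a[:, None] - a[None, :])
    circular = np.minimum(diff, 360.0 - diff)  # pairwise angular separation
    model = DBSCAN(eps=eps_deg, min_samples=min_samples, metric="precomputed")
    return model.fit(circular).labels_


# Markings from one base point: one fan pointing near 0 degrees and a
# second fan, from a later eruption, pointing near 120 degrees.
angles = [10, 15, 358, 5, 120, 125, 118, 130]
angle_labels = cluster_fan_angles(angles)
```

Because DBSCAN links points through chains of neighbors, a cluster can in principle span more than 20 degrees in total if intermediate angles bridge the gap; the sketch does not attempt to prevent this.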
The values of the clustering parameters strongly influence the number of identified features. We therefore studied extensively how they affect our results by reviewing random subsets of the data-set, which led to the empirical determination of the clustering parameter values that we eventually used for the catalog production. These procedures are discussed in the following sections.

Marking    Dimension     Small    Large
Fans       xy (base)     10 px    NA
Fans       angle (deg)   20       NA
Blotches   xy (center)   10 px    25 px
Blotches   radius (px)   30 px    50 px
Table 3: Empirically determined epsilon values for the clustering pipeline. NA: Fan markings did not require a second clustering run with relaxed precision on the distance; apparently the fact that a fan requires drawing from a distinguishable starting point helped the volunteers to keep the scatter small, both in base coordinates and angle precision.

Cluster parameters
min_samples. As described in Section 3.2, Planet Four tiles have varying numbers of user classifications, thus the classifications for each Planet Four tile are clustered separately, with a variable requirement on the min_samples clustering parameter. More classifications for a Planet Four tile means that we have a higher "sensitivity" to smaller features (see for example Fig. 10, right), so to achieve a uniform detection efficiency, we implement a scaling factor on the required number of samples per cluster. This results both in a higher sensitivity to have seasonal fans and blotches marked and in higher precision averaged objects at the end of the clustering process. In other words, the signal-to-noise ratio (SNR) is higher for a Planet Four tile that was classified by a larger number of volunteers, and we adapted the clustering process to normalize for that fact.
To address the variable SNR in our data, we empirically determined a scaling factor min_samples_factor (MSF) that, multiplied by the number of classifications that contain blotch or fan markings, yields the min_samples value for the DBSCAN algorithm: $\mathrm{min\_samples} = \mathrm{round}(\mathrm{min\_samples\_factor} \cdot n_\mathrm{markings})$, with $n_\mathrm{markings} \leq n_\mathrm{classifications}$, where $n_\mathrm{markings}$ is the number of classifiers that have added either blotch or fan markings as classifications.
The best value for MSF was empirically found to be 0.13. For example, when a Planet Four tile has $n_\mathrm{classifications} = 30$ classifications (our current retirement value) that all contain markings, min_samples will be round(0.13 · 30) = 4. This value is the number of cluster members required for a cluster to be created. When a tile has 70 submissions, however, 9 cluster members are required for a cluster to be deemed a real detection and to be entered into the next stage of the pipeline. In this way, we exploit the higher sensitivity that comes from a larger number of submitted classifications.
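The scaling rule can be written out as a short Python sketch. The function name is hypothetical and not taken from the pipeline; the default factor of 0.13 and the two example inputs are the values quoted in the text.

```python
def min_samples_for_tile(n_markings: int, min_samples_factor: float = 0.13) -> int:
    """Scale DBSCAN's min_samples with the number of classifications
    that contain fan or blotch markings (hypothetical helper name)."""
    return round(min_samples_factor * n_markings)


# The two cases quoted in the text: 30 and 70 classifications with markings.
print(min_samples_for_tile(30))  # 4
print(min_samples_for_tile(70))  # 9
```

Note that Python's built-in round() rounds exact halves to the nearest even integer; whether the pipeline uses the same convention is not stated in the text.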
epsilon. The second DBSCAN parameter, epsilon, is the largest distance that two points may have between them to still be considered part of the same cluster. The unit of this distance depends on the feature being clustered: the base point coordinates of fans and the center coordinates and semi-radii of blotches are measured in pixels, while fan angles are clustered in degrees. The size scale of the dark fans and blotches varies significantly between different regions of interest at the south pole of Mars. When trying to cluster our data with only one value of epsilon, we found that it was not possible to properly resolve small markings on the order of 20 pixels, which were precisely positioned by the volunteers, while also successfully clustering markings of much larger deposits that could stretch over more than half of the Planet Four tile shown to the volunteers. The spread in marking coordinates is smaller for smaller features (we think because of an increased attention to detail for smaller features), and thus, to ensure identification of large features, we implemented a second stage of clustering with larger allowed values for epsilon. The resulting values in Table 3 were selected empirically after review of a random subset of the pipeline output. Fig. 15 shows an example parameter scan review graphic that the science team used to determine the parameter values that work best for our task.
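To illustrate the two-stage idea, the following sketch implements a minimal pure-Python DBSCAN and runs it twice on blotch center coordinates, using the small and large epsilon values from Table 3. This is a sketch under stated assumptions and not the pipeline's implementation: the function names are hypothetical, the example coordinates are invented, and we assume that the second pass is applied only to markings left unclustered by the first pass, which the text does not specify.

```python
import math


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN; returns one label per point, with -1 meaning noise."""
    n = len(points)
    # Each neighborhood includes the point itself.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [-1] * n
    visited = [False] * n
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        if len(neighbors[i]) < min_samples:
            continue  # not a core point; stays noise unless a cluster adopts it
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border or core point joins the cluster
            if not visited[j]:
                visited[j] = True
                if len(neighbors[j]) >= min_samples:
                    queue.extend(neighbors[j])  # expand only through core points
        cluster += 1
    return labels


def two_stage_cluster(points, eps_small, eps_large, min_samples):
    """First pass with the small epsilon, second pass with the large epsilon
    on the leftovers (assumed ordering, see the lead-in)."""
    labels = dbscan(points, eps_small, min_samples)
    leftover = [i for i, lab in enumerate(labels) if lab == -1]
    if leftover:
        offset = max(labels) + 1
        second = dbscan([points[i] for i in leftover], eps_large, min_samples)
        for i, lab in zip(leftover, second):
            if lab != -1:
                labels[i] = lab + offset
    return labels


# Invented blotch centers (pixels): a tight group, a widely scattered group
# as expected for a large blotch, and one stray marking.
centers = [
    (100, 100), (102, 101), (99, 103), (101, 98),
    (400, 300), (418, 300), (400, 318), (418, 318),
    (700, 50),
]
print(two_stage_cluster(centers, eps_small=10, eps_large=25, min_samples=3))
# [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

With the small epsilon alone, the scattered group is rejected as noise; the relaxed second pass recovers it as a separate cluster while the stray marking remains unclustered.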
Figure 15: Review plots for determining the best clustering parameters for Planet Four tile ID 1cl. In this example, we review the fan clustering with two different min_samples values, controlled by using a min_samples_factor of 0.1 and 0.13, leading to min_samples values of 5 and 7, respectively. Additionally, we scan the epsilon (EPS) value for small deposits with the settings 10, 20, and 30 pixels, while the EPS_LARGE value stays at 25 pixels for these runs (having no effect in this case due to the small size of the markings). The upper left 3 plots are for the setting MSF=0.1 (resulting in a min_samples value of 5) and EPS values of 10, 20, and 30 pixels. The second group, with an MSF of 0.13 (resulting in min_samples=7), starts in the upper right with the fourth plot in the upper row and continues in the lower left with the first two plots, again showing the tests for EPS values of 10, 20, and 30 pixels, respectively. The last two plots in the lower row show what the volunteers actually marked and what they received as input for the markings, i.e. the Planet Four tile cut out from the larger HiRISE image. The number of fans clustered varies significantly for different clustering parameter values, with n between 11 and 16. We favor the setting in the upper right plot because it correctly identifies all small center fans while not creating an object for the small black spot at the top of the image tile. To reach this, the results from Lower Middle and Lower Right are compared, and the higher-voted markings at comparable locations win. How high that winning ratio must be for a marking to enter the final catalog is determined by the threshold value (see Section 4.3.1). Note how the center fans are cleanly identified and win the voting competition against the blotch at the same location. The opposite is true for the small object identified at the middle left, where a red blotch marking has won against the small cyan fan.

Combination
When the direction of fan deposits is not very pronounced, i.e. the prevalent winds were weak at the time of the jet eruption, there is ambiguity in identifying the deposit as a fan or a blotch. This can result in a given ground source having surviving clusters of both fan and blotch markings, which need to be combined in a strategic way to create a final object category for the observed ground source that will be listed in the resulting object catalog. We use the relative frequency with which each marking tool was used to create the two marking clusters to quantify how fan-like a source is. For example, if 5 people classified a marking as a fan, but 5 other people marked it as a blotch, we assign a fan probability P(fan) of 0.501 by applying $P(\mathrm{fan}) = \frac{n_\mathrm{fans} + 0.01}{n_\mathrm{fans} + n_\mathrm{blotches}}$, with $n_\mathrm{fans}$ and $n_\mathrm{blotches}$ the number of volunteers that marked a fan or a blotch, respectively. The fudge value 0.01 is required to be able to make an either-or decision for the object when $n_\mathrm{fans} = n_\mathrm{blotches}$; it decides this close call in favor of fans instead of blotches, due to the usefulness of fans for further scientific analysis. We determine to which markings this procedure is applied by calculating the pair-wise Euclidean distance for all clustered objects and checking whether clusters are within a chosen limit of 30 pixels of each other. We chose this value to allow slightly more imprecision in the positioning of the markings than in the clustering that created these averages, but without combining too many markings that really should be individual items. We have reviewed several hundred subsets of data and determined 30 pixels to be a good compromise between these competing requirements. If a pair meets the combination criterion, we use the above formula to calculate P(fan) for this pair of markings.
This value ranges from approximately 0 to approximately 1, with values near 0 indicating a definite blotch when $n_\mathrm{fans} = 0$ and values near 1 indicating a definite fan when $n_\mathrm{blotches} = 0$ (the small offset from exactly 0 and 1 is caused by the fudge value); in other words, none or all of the volunteers had drawn a fan, respectively. We then create a meta-object for this pair, storing P(fan) under the name 'vote_ratio' in the catalog files, together with all other data for both objects. We do this to enable future users of the catalog to decide on their own how reliably a marking must be identified as a fan before it is used as such in a study. For example, a specific study might require using only the clearest fan markings, perhaps those with a P(fan) larger than 0.8. Applying such a cut is called Thresholding in our pipeline and is described in the next section.
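The vote ratio is simple enough to state as a short Python sketch. The function name is hypothetical; the fudge value of 0.01 is the one given in the text. Evaluated as written, the formula yields 0.501 for the 5-versus-5 example, and it slightly overshoots 1 when no volunteer drew a blotch.

```python
def fan_probability(n_fans: int, n_blotches: int, fudge: float = 0.01) -> float:
    """P(fan) for a fan/blotch cluster pair (hypothetical helper name)."""
    return (n_fans + fudge) / (n_fans + n_blotches)


print(round(fan_probability(5, 5), 3))  # 0.501, the tie is decided in favor of the fan
print(round(fan_probability(0, 5), 3))  # 0.002, a definite blotch
print(round(fan_probability(5, 0), 3))  # 1.002, a definite fan
```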

Thresholding
For concrete applications, e.g. for this publication, a scientist can now apply a cut on P(fan) that writes out the decision to a new catalog file with fans and blotches. For example, a cut on P(fan) of 0.8 means that all meta-objects with a value smaller than 0.8 are written out as the underlying blotch, while for meta-objects with a value larger than 0.8 the stored fan is written out. In both cases, the remaining data of the meta-object that was thresholded against are dropped from the newly created catalog file, but they remain available for other thresholding operations as an intermediate data product. An example use case is a scientist who wants to study the sensitivity of their research to the applied cut. For example, if we want to provide wind direction data to a mesoscale climate simulation, we might want to make sure that only the most certain directions are used and would therefore apply a higher cut on the meta-object value.
For the catalog that we deliver with this work, we chose a simple majority threshold of 0.5, so that the catalog offers the broadest use case. Choosing a simple majority means that we treat a marking as a fan as soon as at least as many volunteers have classified the object as a fan as have classified it as a blotch. Catalog files with this applied P(fan) threshold of 0.5, all intermediate data products, and instructions on how to apply a threshold for writing out new catalog files will be provided as supplementary products (see Appendix D for more details).
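A thresholding step of this kind could look as follows in Python. This is only a sketch: the dictionary layout and the keys 'fan' and 'blotch' are hypothetical, only the name 'vote_ratio' comes from the catalog description, and assigning a value exactly equal to the cut to the blotch is an assumption, since the text defines only the cases smaller and larger than the cut.

```python
def apply_threshold(meta_objects, cut=0.5):
    """Split meta-objects into fans and blotches by their vote_ratio.

    Values exactly equal to the cut are written out as blotches here,
    which is an assumption of this sketch.
    """
    fans, blotches = [], []
    for obj in meta_objects:
        if obj["vote_ratio"] > cut:
            fans.append(obj["fan"])
        else:
            blotches.append(obj["blotch"])
    return fans, blotches


# Invented meta-objects; 0.501 corresponds to an even fan/blotch vote.
metas = [
    {"vote_ratio": 0.501, "fan": "fan_1", "blotch": "blotch_1"},
    {"vote_ratio": 0.2, "fan": "fan_2", "blotch": "blotch_2"},
    {"vote_ratio": 0.9, "fan": "fan_3", "blotch": "blotch_3"},
]
print(apply_threshold(metas, cut=0.5))  # (['fan_1', 'fan_3'], ['blotch_2'])
print(apply_threshold(metas, cut=0.8))  # (['fan_3'], ['blotch_1', 'blotch_2'])
```

With the simple majority cut of 0.5, the fudge value in P(fan) makes an even vote come out as a fan, consistent with the description above.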

Ground Projection
For each Planet Four tile, the clustering of volunteer-drawn markings to identify seasonal sources is performed using the pixel positions within the Planet Four tile. Once the cluster dimensions and position have been identified, the source's true location on the south pole must be calculated. However, the non-map-projected color mosaics generated by the HiRISE team, from which the Planet Four tiles are derived, do not contain the spacecraft information necessary to compute the latitude and longitude of each pixel. We therefore partially reconstruct the mosaics from the raw HiRISE image products, the Experiment Data Records (EDRs), building a red-filter-only composite image with the spacecraft information required to perform coordinate transforms. The HiRISE EDRs were obtained from the HiRISE Data Node of NASA's Planetary Data System (PDS). We developed a reduction pipeline in Python using the US Geological Survey's (USGS) Integrated Software for Imagers and Spectrometers (ISIS) [Anderson et al., 2004; Becker et al., 2007] and the ISIS-3 Python wrapper Pysis for this purpose.
We briefly summarize the steps used to generate the red-filter-only mosaic, as shown in Fig. 18 together with the required ISIS-3 application names. We start with the two center RED filter CCDs (RED 4 and 5), each with two readout channels. All four EDR files (2 for each CCD) are read in and converted to ISIS-3 cube format, and the SPICE (Spacecraft & Planetary ephemerides, Instrument C-matrix and Event kernels) information for MRO is added to the EDR headers. For each CCD, we combine the two channel EDRs into a single image. The combined image is then normalized to remove both the striping and left/right normalization effects. This step is not necessary for obtaining map projection information, but it makes it easier to visually inspect the final combined mosaic. Once both CCDs have been reduced, they are combined into a final mosaic, accounting for the 48-pixel (in 1×1 binning) overlap.
Once the single-filter red mosaic is made, we are able to translate any fan and blotch pixel position to latitude and longitude on the south pole using ISIS-3's campt application. The catalog tables P4_catalag_v1.0_L1C_cut_0.5_fan_meta_merged.csv (and _blotch_meta_merged.csv, respectively), provided as supplemental files, include the cluster coordinates as latitude/longitude derived from this process, as well as a set of positional coordinates (X, Y, Z) in the body-fixed reference frame for Mars, measured in kilometers.

Overlap regions
As previously mentioned in Section 2, to avoid edge effects, HiRISE images are cut into screen-sized tiles such that there is a 100-pixel overlap with the neighboring tiles. This way, fans and blotches that cross the boundary between tiles are completely visible in at least one of the tiles covering an area. However, from our own Planet Four marking efforts and from analyzing results from Planet Four volunteers, we have determined that the classification tools provide such a high level of precision in placement that many volunteers push a fan or blotch marking beyond the bounds of the shown image area to make it fit a partially shown fan or blotch. This results in several markings for the same object stemming from different Planet Four tiles, as shown in Fig. 19. It can be seen in this figure that the directions of the fans match, despite the fact that some tiles showed only a small part of a fan in the overlap area. We hence conclude that a wind direction analysis is not adversely affected by this analysis artefact. For a future study focusing on the area covered by markings and on counts of fan and blotch activity, we will implement a merging procedure to remove multiple markings, similar to the Combination step in our pipeline, as described in Section 4.3.

[...], b0a (all 3-letter tile_ids need 'APF0000' prepended for the full ID). The shapes of the tiles are distorted in this plot compared to their displayed on-screen size. Each tile was clustered individually, as indicated by the different marking colors. The solid lines indicate where an unshared division between the tiles would lie; the dashed lines show the overlap region that was added to each tile to maximize the information available to the volunteers. This plot shows how well the marked fans, and specifically their directions, match, despite the fact that sometimes only a very small part of the whole fan was visible to the classifying volunteer. For increased precision in total marking counts and in the area covered by markings, we will design an object merging procedure for these overlap regions (next paper).

Data Validation
To date, there is no published catalog of the locations and numbers of seasonal defrosting features for any of the HiRISE images of the Martian south polar region to which the Planet Four results could be compared. In order to assess the accuracy and recall rate of Planet Four and to confirm that the majority of fans and blotches present in the HiRISE observations are identified when combining multiple classifier markings, we have created a 'gold standard' data-set based on expert assessment. Using the same classification interface and marking tools on the Planet Four website as the citizen scientists used, the Planet Four Science Team reviewed a subsample of the Seasons 2 and 3 tiles and produced a catalog of markings. Similar validation processes have been applied in our previous publication for the sister project Planet Four: Terrains [Schwamb et al., 2017a] and to crowd-sourced crater counting data for the Moon [Bugiolacchi et al., 2016; Robbins et al., 2014].
To generate the gold standard data-set, 960 Season 2 tiles and 767 Season 3 tiles were randomly selected and equally divided among three of the primary Planet Four Science Team members (GP, KMA, MES) for review. This corresponds to 3 % of the tiles from each season classified on Planet Four. Additionally, another 192 tiles, from both Season 2 and Season 3, were randomly chosen and classified by all science team gold standard classifiers in order to compare the science team markings to each other. This corresponds to approximately 0.4 % of each season's tiles. The Planet Four tile_ids of the gold standard classifications and the user names of the science team members that did the analysis are provided in the supplemental data file P4_catalog_v1.0_gold_standard_ids.zip.

Figure 20: Comparing counts of identified objects (i.e. fans and blotches together) per Planet Four tile between experts and the catalog data; here, for the 192 common tile_ids that were classified by all experts. Bin size is 5; each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 75, omitting single-entry bins above.

Counts of objects identified
We use the expert classifications from the science team together with our final catalog to explore how well fan and blotch features are identified and how accurately the shapes and dimensions are represented in the Planet Four catalog. We show a tile-based comparison in Appendix F.1, but first we examine the collective properties of the part of the Planet Four catalog that represents the gold standard tiles. We compare these distributions to the expert classifications, both combined and per expert reviewer. Figure 20 compares the number distribution of identified sources (i.e. fans + blotches) per Planet Four tile between experts and the catalog data for the 192 common tiles that were classified by all three science team members (KMA, GP, MES). Among the expert classifiers there are some visible differences, especially where the interpretation of one or two images dominates the value of the histogram bin. The final catalog is within the variance of the individual expert assessments. We can see this further in Figure 21, which shows the number distribution of identified objects (i.e. fans and blotches together) per Planet Four tile for the tiles that were classified by only one of the science team members. We note that even tiles with 30 or 40 fans and/or blotches are still well represented in the catalog. We also use our expert gold standard classifications to examine the physical sizes and areal coverage of the Planet Four catalog fans and blotches (see Figures 22 to 25). As in the previous comparisons, there is good agreement. The differences between the catalog and the experts are within the variance seen between the individual expert classifiers. Differences between the catalog and the experts become more apparent in small-number regimes (when fewer than 10 sources comprise the bin). These differences between the distributions at small sizes are consistent with small-number Poisson uncertainty on the histogram values [Kraft et al., 1991]. Thus, fan lengths and blotch areas are well reflected in the Planet Four catalog.

Figure 21: Expert vs. catalog object identification frequency. Comparing counts of identified objects (i.e. fans and blotches together) per Planet Four tile between experts and the catalog data. Bin size is 5; each bin is directly compared between data from experts (in dark blue) and catalog data (in orange), with the experts GP, MES, and KMA respectively, from top to bottom. Each histogram contains data for 432 tiles, with each expert classifying an independent data-set.

Figure 25: Blotch area, expert vs. catalog. Comparing measured blotch areas between experts and the catalog data. Bin size is 5000; each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 120,000, omitting single-entry bins above.

Of those, we used 39 that had more than 3 fans, for better statistics (the median number of fans per tile is 4, see Section 6).

Wind direction comparison
In this histogram, we show the difference in the mean angle of the fans in these 39 Planet Four tiles between the science team and the volunteers. Overall, we have good agreement, with a few rare outliers, discussed in the text and in Figures 27 and 28. Bin size is 2. Right: Standard deviations (STDs) of the directions of the fan markings that went into each cluster, before they are merged into the averaged catalog object. This plot shows the distribution of these STDs for the set of 192 common gold tiles, which had a total of 904 fans. Bin size is 1.

Fig. 26, left, shows a histogram of the differences in the mean-over-tile fan directions between the catalog entries, which are clustered from all the volunteers' markings, and the average from the three science team members. In general, the agreement is very good, with differences usually smaller than 10 degrees. Another way to investigate our uncertainties is to calculate the angular standard deviation of the cluster member markings that are merged into each final catalog object, regardless of whether the markings were made by an expert or a volunteer. Fig. 27 discusses the lower outlier of Fig. 26, indicating that the respective Planet Four tile presents a more difficult than usual scenario, with a naturally occurring higher variance of the actual deposit directions on the ground. Not only are the deposit shapes visible in the upper left more irregular than usual, there is also a visible gradient of directions across this tile, as can be seen from the exaggerated fan pointers. This gradient is probably caused by the basin shapes in the Inca City region, which can impose a topographic control on the alignment of fan deposits in addition to the usual wind control. Nevertheless, our reduction pipeline reliably reduces the markings for every deposit, albeit with higher than usual variance in the orientation and size of the markings. With no single clear fan direction in the image tile, it is reasonable to expect a higher variance and hence a larger difference when compared to the 3 science team members.
In a similar fashion, Fig. 28 discusses the high-side outlier of Fig. 26. While fans have been identified, their count is low, so that small deviations have a larger effect on the comparison with the catalog data. Additionally, the few fans that are visible appear to show different directions, leading to a less certain fan direction with a higher variance, which in turn can lead to larger differences when comparing the values.
In Fig. 26, right, we plot the standard deviations for all 904 fan clusters for the 192 common tiles that were analyzed by all experts. The right end of this histogram is cut off by our angular clustering parameter of 20°, meaning larger angular differences are never clustered together. However, the majority of standard deviations lie far below that safety cut-off value for the clustering. We estimate an average uncertainty for our fan directions of about (5 ± 3)°, using a half maximum width of this histogram. The actual uncertainty highly depends on the quality of the data as given by the HiRISE binning mode and the local variability of winds, leading to increased diffusion of the deposits. We believe these factors lead to the non-Gaussian skew of the histogram.
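Because fan directions are angles, averages and spreads near the 0°/360° wrap-around need some care. The following sketch shows one common way to compute a mean direction, a spread, and a signed difference between two directions using circular statistics. It is an illustration only: the text does not state how the pipeline computes its angular standard deviations, and all function names and example angles are hypothetical.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction in degrees, in the range [0, 360)."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


def circular_std_deg(angles_deg):
    """Circular standard deviation in degrees, sqrt(-2 ln R)."""
    n = len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    r = min(1.0, math.hypot(s, c))  # mean resultant length, clipped against round-off
    if r == 0.0:
        return math.inf  # directions cancel completely; the spread is undefined
    return math.degrees(math.sqrt(-2.0 * math.log(r)))


def angular_difference_deg(a, b):
    """Signed difference a - b, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


# Invented fan marking directions for one cluster, in degrees.
cluster = [120.0, 125.0, 130.0]
print(round(circular_mean_deg(cluster), 2))  # 125.0
print(round(circular_std_deg(cluster), 2))   # 4.08
print(angular_difference_deg(5.0, 355.0))    # 10.0
```

For narrow distributions such as those below the 20° clustering limit, the circular standard deviation is nearly identical to the ordinary standard deviation of the angles; the circular form only matters for clusters that straddle North.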
Additional validation results can be found in Appendix F.

Summary
In conclusion, our catalog has high completeness in most cases. Outliers have been found to be caused by special circumstances with more challenging classification tasks, which create higher variance for all classifiers, including the experts. The analysis of the gold standard sample demonstrates that the bulk composition of the Planet Four catalog represents a fairly complete picture of the seasonal fans and blotches captured in the HiRISE images.

Results: Fan and Blotch Catalog
From 221 HiRISE images from Mars Years 29 and 30, cut into 42,904 Planet Four tiles, the Planet Four volunteers produced almost 2.8 million fan markings, which were clustered into 159,558 fans in our MY29/MY30 catalog. In Table 4 we show an example of fan catalog data. For blotches, 3.46 million raw markings were combined into 250,164 blotches. 29.6 % of the image tiles (= 12,693) end up without any clustered markings in our catalog. Fig. 29 shows the distribution of the fraction of empty tiles per HiRISE image vs. solar longitude. Visual checks of data with fractions above 0.8 confirmed that these HiRISE images are mostly free of CO$_2$ jet deposits in spring; in late summer, however, when the seasonal CO$_2$ ice layer has fully sublimated, fan and blotch deposits are rendered mostly invisible because they blend into the now ice-free background. A notable exception to this general effect is the ROI Inca City, where the summer data, after $L_s$ = 250°-260°, regularly show fan deposits that are still discernible. This could point to an interesting difference in the compaction of the ground soil and its related observed texture. New deposits from CO$_2$ jet eruptions may be sufficiently different in texture from the background as a result of particle sorting and related phase function changes of the fresher surface.

After $L_s$ = 260° all CO$_2$ is gone (earlier at lower latitudes), and most of the HiRISE images appear empty in terms of identifiable blotches or fans, because any deposits blend with the ice-free background.

6.1. Catalog properties
6.1.1. Fan counts
The highest counts of fans and blotches were 167 fans in the tile_id APF00006mr and 278 blotches in the tile_id APF00007t9, shown in Figures 30 and 31. These data serve as an indication of the dedication of the Planet Four volunteers, who produced results at such high spatial density. The median count of fans and blotches per tile is 4. The distribution of both numbers is shown in Fig. 32.

Fan lengths
As an example of the possibilities of the produced catalog, we describe the measured fan lengths in the catalog. The catalog column distance requires scaling by the values in map_scale to correct for the different HiRISE binning modes. The distribution of these measurements is shown in Fig. 33. About 97 % of all fans are below 100 m in length, with a median value of 24 m.
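As a minimal sketch of this scaling, the following Python snippet converts catalog distances to lengths in meters and computes summary statistics. It assumes that distance is given in pixels, that map_scale is given in meters per pixel, and that the scaling is a simple multiplication; the rows shown are invented values, not catalog entries.

```python
import statistics


def fan_length_m(distance_px: float, map_scale: float) -> float:
    """Fan length in meters, assuming distance in pixels and map_scale in m/pixel."""
    return distance_px * map_scale


# Invented example rows using the catalog column names from the text.
rows = [
    {"distance": 96.0, "map_scale": 0.25},
    {"distance": 50.0, "map_scale": 0.5},
    {"distance": 40.0, "map_scale": 1.0},
]
lengths = [fan_length_m(r["distance"], r["map_scale"]) for r in rows]
print(lengths)                     # [24.0, 25.0, 40.0]
print(statistics.median(lengths))  # 25.0
# Fraction of fans shorter than 100 m in this toy sample.
print(sum(length < 100.0 for length in lengths) / len(lengths))  # 1.0
```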
The three largest fans measured are all from the same ROI, called Manhattan Classic (Lat −86.39°, Lon 99°), and have lengths of 373 m, 368 m and 361 m, respectively. They were identified in the HiRISE images ESP_013095_0935 (longest) and ESP_011961_0935 (second and third). The two longest fan markings even identify the same fan, but at different times in the season, with the longest observed at $L_s$ = 265° and its shorter counterpart at $L_s$ = 209°. As the two differ by only 5 m, we attribute the increased length both to material potentially being moved around by winds during spring and to a decrease in the precision of identification after the CO$_2$ has sublimed and the deposits start to fade into the background. However, we interpret the fact that the largest fan was identified twice as a further indication of the high reliability of our results, considering that the random image serving procedure of the Planet Four classification interface ensured that volunteers do not classify images in the order in which they were taken, because that would have increased the chances of being biased by their previous classification. In this case, where 119 volunteers classified APF0000dtk with the longest fan and 54 volunteers classified APF0000de3 with the second longest fan (shown in Fig. 14), only one volunteer classified both tiles.
An overview of the fan length distributions for all major ROIs over both Martian years of data is shown in Fig. 34. When compared between Mars Years 29 and 30, the total (over all ROIs) fan length statistics are very similar, with a median of 24.2 m for MY29 and 23.8 m for MY30. However, we identify specific ROIs that have different fan properties between MY29 and MY30.

Wind Direction Results from Four Sample Regions of Interest (ROIs)
Early in the mission, the HiRISE team defined several regions of interest (ROIs) within the southern polar areas that have been extensively monitored for seasonal activity ever since (the list of original seasonal ROIs can be found in Hansen et al. [2010]). We have selected a subset of these ROIs to be analyzed by Planet Four, as shown in Table 1. The distribution of the ROIs over the pole is shown in Fig. 4.
Below we focus on 4 example ROIs to showcase the use of the Planet Four data catalog and our ability to monitor wind directions using fan marking positions and locations. We have picked these 4 ROIs (informally named Ithaca, Giza, Manhattan, and Inca City) for regional case studies of the seasonal winds because the temporal coverage of these locations is the highest. We describe each ROI's general setting and geomorphology based on HiRISE observations and our previous works [Pommerol et al., 2011]. We then present the wind direction maps over the spring season at each of these locations. The wind rose diagrams for each HiRISE image separately are available in the supplementary file P4_catalog_v1.0_wind_rose_diagrams.pdf.

Ithaca
The Ithaca region is located at southern latitude 85.2°, eastern longitude 181.4°. This location is away from the permanent polar cap, at the edge of the cryptic region, and is situated on a surface that is relatively smooth on a large scale: the digital terrain model produced by HiRISE (DTEPD_040189_0950_040216_0950_A01) shows vertical elevation variations of less than 60 m across the Ithaca region. At the same time, on the meter scale the surface in Ithaca is rough, showing irregular and uneven bumps and pits. No araneiforms (i.e. radially organized channels) were detected in Ithaca in HiRISE imaging, while rare isolated troughs and patterned ground similar to araneiform troughs are present.
During local spring, fan-shaped deposits densely cover the Ithaca region (see an example in Fig. 35). Opening angles and lengths of the fans were reported to evolve during spring, although the nature of these changes was not quantified. Multiple fans were observed to emerge from common vents, at times merging together to create a single wider fan. The directions of the fan deposits were noted to be consistent from one Martian year to another, with only little variation.
An interesting detail about Ithaca is the very prominent bluish halos and fans that are repeatedly observed here. In contrast to the more common dark fan-shaped deposits, these halos and fans have higher albedos, approaching the albedo of fresh ice deposits. In Ithaca they are also distinctively bluer than the rest of the surface. There are at least two types of such bright deposits. One type resembles narrow fans that are located centrally over the older dark fans. These appear early in spring, before $L_s$ = 190°. The other type resembles halos contouring the pre-existing dark fans. They appear on average later than the narrow bright fans.
In summer ($L_s$ > 270°), the seasonal deposits are mostly invisible in Ithaca. This is partly because the small-scale roughness creates a patchy-looking environment, with pits being darker than bumps due either to shadows or to dust collecting in depressions.
Fig. 35 shows a typical plot that we use to analyze derived wind directions in our ROIs. This particular plot was created from Planet Four data for one HiRISE image (ESP_011931_0945) taken in Ithaca at $L_s$ = 207°. To create this plot, we took all the fan markings in the HiRISE image and plotted a histogram of their directions (top right panel of Fig. 35). Note that, in contrast to standard wind rose diagrams, which show the directions from which the winds originate, we use this diagram to show the measured deposition directions caused by the winds, i.e. the direction opposite to the wind origin. We chose this kind of display because it relates more directly to the actual measurements performed by the Planet Four project and does not imply any interpretation. The fan direction is counted clockwise (CW) from the North Azimuth (NA) direction, where 0° always represents North and 270° West. The histogram is not scaled, i.e. the y-axis shows the actual counts of fan markings with directions in each bin on the x-axis. The maximum of the histogram is the most probable direction for the markings, and the width indicates how variable the directions of the markings are for this particular $L_s$. The default size of each histogram bin is 3.6°. In exceptionally rare cases, the number of fan markings, and thus the number of wind measurements, is low for a particular image. Such cases require special treatment and an increase in bin size. In the top left panel, the same data are plotted as a wind rose diagram. This time the histogram is normalized to highlight the difference in directions if several HiRISE images are plotted in the same frame. Note that the position of zero (NA direction) depends on the location of the ROI, i.e. the wind rose diagram is map-projected to the location of the data plotted. Thus, the direction of the fans can be directly compared to the map-projected HiRISE image (bottom panel). In this particular example, the histogram has 2 peaks, which indicate two distinct fan directions. This can be either (1) because of overlapping fan deposits from jets that erupted from the same vents at different times prior to $L_s$ = 207.8° under different wind regimes, or (2) because different areas of the ROI have distinctively different wind regimes. In this example, comparing the derived fan directions to the sub-frame of the HiRISE image indicates that the first case is more probable.
Fig. 36 shows the directions of the fan deposits in Ithaca as retrieved by the Planet Four project for two Martian years, MY29 and MY30. We have separated the spring season into early spring, i.e. before $L_s$ = 210°, and late spring, from $L_s$ = 210° to $L_s$ = 270°. The panels in this figure are organized such that the columns show the separation into early and late spring, while the rows show MY29 and MY30.
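The direction histogram described above can be sketched in a few lines of Python. The function names and the example directions are hypothetical; the 3.6° default bin width (100 bins over the full circle) and the clockwise-from-North convention are taken from the text. The second helper converts a measured deposition direction into the direction from which the wind originates, which is the quantity a standard wind rose would show.

```python
def direction_histogram(directions_deg, bin_width=3.6):
    """Unscaled counts of fan directions per bin, clockwise from North."""
    n_bins = round(360.0 / bin_width)
    counts = [0] * n_bins
    for d in directions_deg:
        # The final modulo guards against an out-of-range index from round-off.
        idx = int((d % 360.0) / bin_width) % n_bins
        counts[idx] += 1
    return counts


def wind_origin_deg(deposit_direction_deg):
    """Direction the wind comes from, opposite to the deposition direction."""
    return (deposit_direction_deg + 180.0) % 360.0


# Invented fan directions in degrees.
directions = [124.0, 125.0, 128.0, 300.0]
counts = direction_histogram(directions)
peak_bin = counts.index(max(counts))
print(len(counts), peak_bin, counts[peak_bin])  # 100 34 2
print(wind_origin_deg(125.0))                   # 305.0
```

For images with few fan markings, a larger bin_width can be passed in, mirroring the increase in bin size mentioned in the text.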
Ithaca fans maintain the same direction, towards ≈125°, through the whole spring in both years, with only a small shift towards East (see also top left panel of Fig. 40). In MY29 the fan direction histograms are wider than in MY30. A narrow histogram indicates small deviations of the governing winds at the times of jet eruptions. The shift of the mean wind direction is less than 10° in early spring, and the maximum shift is 25° over the whole season in MY29. Histograms widen with increasing $L_s$ and sometimes develop double maxima, indicating more variability in the marked fan directions. This is also reflected in the increase of the standard deviation towards the end of spring. It can be attributed either to larger wind variability later in spring or to winds becoming strong enough to lift particles from the ground between jet eruptions. Overall, MY30 shows similar behavior to MY29.

Giza
The Giza region is at southern latitude 84.8°, eastern longitude 65.7°. It is located closer to the edge of the permanent cap than Ithaca. It is also near a trough that exposes southern polar layered deposits, while the area of Giza itself is flat on the km scale (see HiRISE DTM DTEPC_004736_0950_005119_0950_A01). On smaller scales, as can be seen in multiple HiRISE images taken over this area (including those that were input for Planet Four), the region is covered in modulated bumps and small ripples. One side of this ROI is covered in yardangs.
Very large and very intricate araneiform structures are located in this region. Their troughs are narrow and long, with high degrees of branching. These araneiforms are very active in spring: multiple long and narrow fans emerge from their troughs and cover an extended area. HiRISE detected a dusty reddish haze over the araneiforms in Giza in several years, indicating active loading of dust into the lower layer of the atmosphere. The directions of the fans in late spring were previously noted to co-align with yardangs, suggesting that the wind regime in this area in summer stayed stable for an extended period of time [Hansen et al., 2010].
Similar to Ithaca, in Giza we do not observe significant differences in fan directions between MY29 and MY30 (Fig. 40 lower left panel and Fig. 37). Early images taken before $L_s$ = 190° show very narrow histograms with a maximum between 300° and 310°. The maximum, which marks the direction of most fans, slowly shifts towards 360°. The shift rate is higher than in Ithaca (> 45° over the whole spring). The number statistics of fan detections worsen in late spring in both years, most noticeably in MY30 (see histograms for the late spring of MY30). This is explained by the decreasing contrast between the fan deposits and the undisturbed surface around them in late spring images, i.e. the fans blend in with their environment.
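Because the Giza fan directions approach 360°, an arithmetic mean can be misleading near the wrap-around. The text does not state how mean directions and standard deviations were computed; the sketch below shows circular statistics as one common choice, purely as an assumption. The function name and the sample values are hypothetical.

```python
import math


def circular_mean_std(directions_deg):
    """Circular mean and circular standard deviation, both in degrees."""
    angles = [math.radians(d) for d in directions_deg]
    n = len(angles)
    s = sum(math.sin(a) for a in angles) / n
    c = sum(math.cos(a) for a in angles) / n
    mean_deg = math.degrees(math.atan2(s, c)) % 360.0
    # Mean resultant length, clamped to guard against rounding slightly above 1.
    r = min(math.hypot(s, c), 1.0)
    if r == 0.0:
        return mean_deg, math.inf  # directions cancel out; spread is undefined
    std_deg = math.degrees(math.sqrt(-2.0 * math.log(r)))
    return mean_deg, std_deg


# Two hypothetical fan directions on either side of north.
sample = [350.0, 10.0]
mean_deg, std_deg = circular_mean_std(sample)

# The arithmetic mean points south, which is clearly wrong for this sample.
naive_mean = sum(sample) / len(sample)
```

The circular mean of this sample lies at north (0°, equivalently 360°), while the arithmetic mean is 180°.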

Manhattan
The Manhattan region is in a very active area with at least 3 HiRISE ROIs that were once all considered under this same name. This area is around southern latitude 86°, eastern longitude 99°. Like the two regions above, it is on the edge of, but still inside, the cryptic region. The ROI is located on the eastern side of a South Polar Layered Deposit (SPLD) trough that in spring is completely covered with seasonal activity. The area is inclined towards the trough, i.e. in the north-west direction, although only slightly. According to the HiRISE DTM (DTEPC_022259_0935_022339_0935_A01), there is a 270 m elevation change over approximately 8 km (≈2° slope).
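The quoted slope follows directly from the two DTM numbers given above. A short check, assuming the 8 km is the horizontal distance and the slope is uniform:

```python
import math

elevation_change_m = 270.0      # from the text
horizontal_distance_m = 8000.0  # "approximately 8 km", assumed horizontal

# Mean slope angle of a uniform incline.
slope_deg = math.degrees(math.atan(elevation_change_m / horizontal_distance_m))
print(f"mean slope: {slope_deg:.2f} deg")
```

This gives a mean slope of about 1.9°, consistent with the ≈2° stated in the text.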
Manhattan is covered in well-developed interlaced araneiforms. Similar to Giza, the araneiforms here have thin, long troughs and branch significantly. Aside from araneiforms, the surface in Manhattan is smooth, even on scales of tens to hundreds of meters, with only a few exceptions of shallow irregular pits.
Seasonal activity is extensive in Manhattan, with dark fan deposits that at times develop bright

Inca City
Inca City is at latitude 81.3°, eastern longitude 295.7°; relative to the aforementioned ROIs it is on the opposite side of the permanent cap and the south pole. The topography of this location is the most complex in our list (HiRISE DTM DTEPC_022699_0985_022607_0985_A01). It is a system of ridges over 300 m high that crisscross each other at almost right angles, forming close-to-rectangular basins. The slopes of the ridges sometimes exceed 13°, providing a variety of insolation environments in a relatively small region. The inner surface of the basins is flat, and most of the araneiforms of Inca City are carved into it. The formation of the Inca City ridge system is debated but is most commonly attributed to the interaction of irregularities of the local crust with an impact-induced compaction wave [Kerber et al., 2017].
Araneiforms in Inca City are morphologically different from those in Giza and Manhattan. They have a well-developed central depression with relatively short troughs extending outwards and are on average smaller.
Seasonal activity in Inca City starts on the slopes of the ridges. Fan deposits extend downwards following gravity lines. The fans are very narrow but do not show any flow features (dark flows come later in spring). It is not fully clear whether the fans in this ROI are directed by gravity or by downslope winds. The surface around and near araneiforms, on the basin floor, gets covered mostly in blotches, suggesting that no significant winds are active inside the basins.
Directions of fans in Inca City are seemingly disordered, particularly in comparison to the 3 ROIs discussed above. However, Inca City is special in this set because it has prominent topography that the other 3 ROIs lack. Thus, the analysis method that works well for our other ROIs might not be applicable to Inca City. The Inca City ridges affect the local deposition of solar energy and influence near-surface winds. Directions of fans in Ithaca, Giza, and Manhattan are modified by near-surface winds that normally pass undisturbed over the whole ROI. In contrast, in Inca City fans are observed almost exclusively on the slopes of the ridges and are aligned with the down-slope direction. However, these fans appear on the slopes gradually through spring: the first fans according to our analysis point in the south-west direction (270° from NA), i.e. they are located on south-west facing slopes. Early observations have the smallest standard deviation, indicating the smallest variation in fan directions (Fig. 39). However, even in the early histograms several local maxima may be detected. The locations of the secondary maxima are determined by the slopes that were covered by the HiRISE image at each $L_s$. Later in spring the fans start to appear on slopes with orientations other than south-west. This widens the histogram for each HiRISE image and makes the location of the histogram maximum a less and less relevant measure of the mean fan direction. This results in a larger variation of the mean fan direction and large standard deviations (bottom right panel of Fig. 40). Local maxima recur at the same directions from image to image in late spring, and the whole scenario repeats in both years with only small variations.
Figure caption (partial): Directions are plotted in degrees relative to the NA direction. Error bars represent the standard deviation of the data and not the error on the mean.
Prevailing winds control the direction of fans in Ithaca, Manhattan, and Giza because the overall topography in these ROIs is smooth and has no obstacles that significantly modify the winds. In Inca City, however, the topography is more prominent, with 3 km-high ridges that break down the general winds and support the creation of katabatic flows. Thus, the fans here follow the slopes of the ridges rather than the wind direction, which is reflected in the large scatter of the mean fan direction and its large standard deviations.

Conclusions
The Planet Four project has produced a catalog of 158,476 fans and 249,801 blotches (ellipses), identifying locations of seasonal surface deposits produced by the CO$_2$ jet processes occurring during spring in the Martian south polar region. The catalog was generated by combining the assessments made by Planet Four volunteers reviewing a set of 42,904 tiles derived from 221 HiRISE observations obtained over 2 Martian Years, covering a set of 28 regions of interest (ROIs) across the south pole. To date, this catalog serves as the largest reporting of locations, sizes, and mapping of seasonal deposits on the Martian surface. The Planet Four fan and blotch catalog constitutes a resource for studying polar winds, climate and polar processes. Using south polar fans as regional wind markers, the Planet Four catalog can provide tests for and input to global and regional atmospheric circulation models.
Statistical comparisons between classifications produced by the science team and catalog results for the same image data (Section 5) demonstrate that the bulk composition of the Planet Four catalog represents a fairly complete picture of the seasonal fans and blotches captured in the HiRISE images. The consistency of fan direction trends between Mars Years 29 and 30, despite the fact that most of the data were analyzed by different volunteers, further indicates the reliability of the methods presented here (see summary Figure 40). We have gone into considerable detail on the methodology behind the data in the catalog and are confident that its content can be productively used by our colleagues for their own research.
For 4 of the 28 ROIs we have presented mean fan directions. In three of these, the fan deposits appear to be directly modified by near-surface winds at the time of jet eruption; the fourth ROI shows the strong influence of topography. In ROIs Ithaca, Giza, and Manhattan, the derived mean winds show no significant inter-annual variability between MY29 and MY30: their directions at the same $L_s$ agree to within 10°. In Inca City, the mean direction of the fans coincides with the direction of the slopes and changes over spring as more slopes become exposed to sunlight and cold jet eruptions occur.
Our analysis in this paper focused on HiRISE observations from seasons 2 (MY29) and 3 (MY30) of the HiRISE southern seasonal processes campaign, with which research into inter-annual variability starts to be feasible. However, the HiRISE campaign now covers 6 seasons of monitoring, and for a number of selected ROIs 5 of these have been or are being analyzed by the Planet Four project at the time of writing. The results from the analysis of these longer timespans and additional areal coverage will be topics of future publications and data releases.

Acknowledgments
This work is also partially enabled by the National Aeronautics and Space Administration (NASA) support for the Mars Reconnaissance Orbiter (MRO) High Resolution Imaging Science Experiment (HiRISE) team. This paper includes data collected by the MRO spacecraft and the HiRISE camera, and we gratefully acknowledge the entire MRO mission and HiRISE teams' efforts in obtaining and providing the images used in this analysis. The Mars Reconnaissance Orbiter mission is operated at the Jet Propulsion Laboratory, California Institute of Technology, under contracts with NASA. The authors also thank Rod Heyd for guidance in extracting the geographic and location information for HiRISE non-map-projected images. This research has made use of the USGS Integrated Software for Imagers and Spectrometers (ISIS) and of NASA's Astrophysics Data System.
KMA and GP were supported for this work by NASA ROSES Solar System Workings grant NNX15AH36G.
All software created for the pipeline is based on the open source language Python, using the matplotlib library [Hunter, 2007] for plotting, the pandas library for data wrangling and analysis [McKinney, 2010], the scikit-learn library [Pedregosa et al., 2011] for the clustering of Planet Four markings and other pre- and post-processing tasks, the IPython and Jupyter system for everyday computing [Perez and Granger, 2007], and the SciPy tools on a daily basis [Jones et al., 2001].
classified before. Duplicate classifications are only a small portion of the data set, comprising 1.9 % of all classifications produced; in those cases, typically only a few classifications or fewer per Planet Four image tile were duplicates.
In order to treat each classification as an independent assessment, we removed all duplicate classifications, keeping only the first response for a given registered user/non-logged-in session for a given cutout.
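The duplicate removal described above can be sketched with pandas as follows. This is an illustration under stated assumptions, not the pipeline code: the column names `user_name`, `image_id` and `created_at` are hypothetical stand-ins for the user or session identifier, the tile identifier and the submission time, and the sample rows are invented.

```python
import pandas as pd


def drop_duplicate_classifications(df):
    """Keep only the first classification per (user or session, tile)."""
    # The table has one row per marking, so first reduce it to one row
    # per classification, ordered by submission time.
    per_classification = (
        df.sort_values("created_at")
          .drop_duplicates(subset="classification_id")
    )
    # Earliest classification for each (user, tile) pair.
    first_ids = per_classification.drop_duplicates(
        subset=["user_name", "image_id"], keep="first"
    )["classification_id"]
    # Keep every marking row that belongs to a retained classification.
    return df[df["classification_id"].isin(first_ids)]


# Hypothetical example: alice classifies tileA twice (c1, then c2).
raw = pd.DataFrame({
    "classification_id": ["c1", "c1", "c2", "c3", "c4"],
    "user_name": ["alice", "alice", "alice", "bob", "alice"],
    "image_id": ["tileA", "tileA", "tileA", "tileA", "tileB"],
    "created_at": [1, 1, 2, 3, 4],
    "marking": ["fan", "blotch", "fan", "fan", "none"],
})
clean = drop_duplicate_classifications(raw)
```

Filtering on `classification_id` rather than on individual rows keeps all markings of the retained first response together.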
We also found a concentration of markings positioned at the top left corner (x=0, y=0) of the marking interface, with nearly all having default values for the other recorded parameters. Only 0.12 % of the 9,631,517 markings recorded for Seasons 2 and 3 are affected. Further investigation shows that less than 7 % of fan and blotch markings with default parameters and with x=0 or y=0 are not centered at the origin. Thus, we believe these origin default-valued markings are due to a JavaScript error. Therefore, we simply delete them from the database, but keep any other markings associated with the affected classifications. Additionally, 33 markings (∼0.003 % of all entries in the Planet Four classification database) do not have all of the required parameters that should have been recorded. We believe this is due to a singularity in the drawing tool for that marker, and we remove those entries from the database. There are also positions in the database recorded for a handful of fans and blotches significantly out of the bounds of the user interface. A classifier can move a marker outside the edge of the image to better capture the center position of a feature, but these positions are well outside the image region. They represent well under 1 % of all classifications, and we have removed them from the analysis presented here. All statistics and values reported in this paper are computed after the filtering described above.
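The three filters described above (origin markings, incomplete markings, and positions far outside the interface) could be expressed as one boolean mask. The sketch below is a simplification under assumptions: the column names `x`, `y` and `angle`, the image dimensions and the out-of-bounds margin are all hypothetical, and it drops every marking at the origin rather than only those with default parameter values.

```python
import pandas as pd


def filter_marking_artifacts(df, width, height, margin):
    """Drop marking rows that look like interface artifacts."""
    is_marking = df["marking"].isin(["fan", "blotch"])
    at_origin = (df["x"] == 0) & (df["y"] == 0)
    incomplete = df[["x", "y", "angle"]].isna().any(axis=1)
    out_of_bounds = (
        (df["x"] < -margin) | (df["x"] > width + margin)
        | (df["y"] < -margin) | (df["y"] > height + margin)
    )
    # "none" and "interesting" rows are kept untouched.
    bad = is_marking & (at_origin | incomplete | out_of_bounds)
    return df[~bad]


# Hypothetical rows: origin artifact, valid fan, far-away blotch,
# fan with a missing parameter, and an empty ("none") classification.
raw = pd.DataFrame({
    "marking": ["fan", "fan", "blotch", "fan", "none"],
    "x": [0.0, 100.0, 5000.0, 50.0, None],
    "y": [0.0, 200.0, 100.0, 60.0, None],
    "angle": [0.0, 45.0, 10.0, None, None],
})
clean = filter_marking_artifacts(raw, width=800, height=600, margin=100)
```

Because only the offending rows are removed, other markings from the same classification remain in the table, as the text requires.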

Appendix C. Raw Classification Data
Here we provide additional details about the raw classification data provided in the online supplementary data file. It is written in the binary HDF5 format, in the variant produced by the pandas library (supported by the PyTables library).
The general structure is as follows: Each classification submission by an individual volunteer creates a classification_id. All objects created by this volunteer in that submission receive the same classification_id, with the marking data for each object being one entry in the classification database. Each data row also has a marking column that identifies whether the row describes a fan, a blotch, an interesting feature (string value "interesting"), or "none" when the volunteer did not create any marking object. Below we describe the columns available in this database:
The Planet Four classification interface initially recorded a different angle than the intended spread angle from the fan marking tool. This was identified and subsequently fixed in the software. The correct spread angle is recoverable from the values stored in the database. We denote markings generated before the patch with the version flag set to 1.0 and those generated after it with the version flag set to 2.0. We provide the corrected spread angle for the affected fans, but leave the version flag in the final catalog for reference. To gather statistics on the understanding of the tutorial, the Planet Four classification database contains all the tutorial markings, indicated by a HiRISE image name of 'tutorial'. For the delivered raw classification database, the range of the fan angles has been converted from [-180°, 180°] to [0°, 360°], while the range of the blotch angles has been converted to [0°, 180°], due to their rotational symmetry.
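The angle range conversions mentioned above can be illustrated with a modulo operation. The text states only that the ranges were converted, not how, so the modulo approach and the function names below are assumptions.

```python
def to_fan_range(angle_deg):
    """Map a fan angle from [-180, 180] onto [0, 360)."""
    return angle_deg % 360.0


def to_blotch_range(angle_deg):
    """Map a blotch (ellipse) angle onto [0, 180).

    An ellipse looks the same after a 180 degree rotation, so angles that
    differ by 180 degrees describe the same orientation.
    """
    return angle_deg % 180.0


fan_examples = [to_fan_range(a) for a in (-90.0, 45.0, -180.0)]
blotch_examples = [to_blotch_range(a) for a in (-135.0, 135.0, 180.0)]
```

Python's `%` returns a non-negative result for a positive modulus, so negative input angles need no special handling.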

Appendix D.2. Pipeline stage levels

Appendix D.2.1. Level 1A
Level 1A is the data that is directly output from clustering and averaging the cluster members into average markings, as described in Section 4.2. Here, the biggest reduction in the number of objects in the system occurs, as the data of all the different volunteers are combined into one object when the clustering process has determined the markings to be part of one cluster. All newly created average fans and blotches are summarized in one fan and one blotch summary file, respectively, with each line representing the mean object obtained from averaging all cluster members. As an example, the content of APF0000p3q_L1A_fans.csv is shown below. When the column names match those given in Appendix C, they have the same meaning. The two new columns are n_votes, which records how many members were in the cluster used to produce this averaged object, and marking_id, which is created at this stage of the pipeline and serves as a tracer throughout the different pipeline outputs: Additionally, each L1A folder contains a text file called clustering_setttings.yaml that summarizes the clustering settings used for these data for reference. The epsilon values are static and all the same, but the min_samples value is dynamically calculated; see Section 4.2.1 for details.
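The step from cluster members to averaged objects with n_votes and marking_id can be sketched as a pandas group-by. This is a hedged illustration only: the `cluster` label column (with -1 for unclustered markings), the marking_id format and the sample data are hypothetical, and only the x and y positions are averaged here because angles would need circular averaging.

```python
import pandas as pd


def average_clusters(markings, prefix="F"):
    """Average the members of each cluster into one object per cluster."""
    clustered = markings[markings["cluster"] >= 0]  # drop unclustered rows
    grouped = clustered.groupby("cluster")
    averaged = grouped[["x", "y"]].mean()
    averaged["n_votes"] = grouped.size()  # number of cluster members
    averaged = averaged.reset_index(drop=True)
    # Hypothetical identifier format, used only to trace objects downstream.
    averaged["marking_id"] = [f"{prefix}{i:06d}" for i in range(len(averaged))]
    return averaged


# Hypothetical volunteer markings with precomputed cluster labels.
markings = pd.DataFrame({
    "x": [10.0, 12.0, 14.0, 100.0, 102.0, 500.0],
    "y": [20.0, 20.0, 20.0, 50.0, 54.0, 500.0],
    "cluster": [0, 0, 0, 1, 1, -1],
})
averaged = average_clusters(markings)
```

In this example six volunteer markings reduce to two averaged objects, which illustrates why this stage produces the largest reduction in object count.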

Appendix D.2.2. Level 1B
At level 1B, the combination pipeline has determined which objects are so close to each other that they should be considered for merging (see Section 4.3). The outputs are between one and three files this time. There is only one file in case all fans and blotches found were so close that they need to be evaluated by their classification votes. Usually, though, there are two to three files, where one file stores the objects that need voting, and the other file(s) store the objects that do not have any close neighbors and will simply be copied over to the final level later. The fans and blotches in these latter files will receive a 'vote_ratio' value of 1.0, indicating that they had a "perfect" probability of being a fan or a blotch, respectively. The third file, which keeps the close objects for the later thresholding, contains these temporary meta-objects in sets of 2 rows, one fan and one blotch, and has the term "fnotch" in its filename (fnotches: FaN-blOTCH). This file contains all the clustering statistics data from L1A required to make a cut decision for L1C, with the data for each meta-object being sorted in alternating rows. Here are the first four rows of the fnotch file APF0000any_L1B_fnotches.csv:
This data stage L1B is what can be used to create a different significance threshold cut for the final data, by filtering on the data column vote_ratio in the fnotch file for the required threshold value. For example, if a higher threshold on the probability for a fan is wanted, e.g. 0.8, one would filter out all rows that start with "fan" and have a vote_ratio value below 0.8. One then needs to decide whether to use this threshold as a general "certainty" filter and simply not take any object with a vote_ratio < 0.8, or whether the blotch should appear instead of the fan.
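The re-thresholding of the fnotch file described above could look like the sketch below. It rests on assumptions that the text does not confirm: that the first column is named `marking`, that rows strictly alternate fan then blotch, and that the vote_ratio in a fan row is the fan probability for the pair. The `fallback_to_blotch` switch is a hypothetical name for the two options discussed in the text.

```python
import pandas as pd


def apply_fan_threshold(fnotches, threshold=0.8, fallback_to_blotch=True):
    """Select one object (or none) from each fan/blotch pair."""
    kept = []
    # Assumed layout: rows alternate fan, blotch for each meta-object.
    for start in range(0, len(fnotches), 2):
        fan = fnotches.iloc[start]
        blotch = fnotches.iloc[start + 1]
        if fan["vote_ratio"] >= threshold:
            kept.append(fan)
        elif fallback_to_blotch:
            kept.append(blotch)  # the blotch appears instead of the fan
        # Otherwise the pair is dropped: a general "certainty" filter.
    return pd.DataFrame(kept).reset_index(drop=True)


# Hypothetical fnotch content: two meta-objects, two rows each.
fnotches = pd.DataFrame({
    "marking": ["fan", "blotch", "fan", "blotch"],
    "vote_ratio": [0.9, 0.9, 0.6, 0.6],
    "x": [10.0, 10.5, 200.0, 201.0],
})
strict = apply_fan_threshold(fnotches, 0.8, fallback_to_blotch=False)
lenient = apply_fan_threshold(fnotches, 0.8, fallback_to_blotch=True)
```

With a threshold of 0.8 the strict variant keeps only the first fan, while the lenient variant also keeps the blotch of the second pair.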

Appendix D.2.3. Level 1C
This level contains the data of the final catalog files, but split up by Planet Four tile. At the end of the thresholding stage (Section 4.3), the rows that pass the threshold filters are appended to the respective blotch and fan files, and these completed files are copied into the L1C directory; this completes the thresholding step and fills the L1C folders. A final tool walks through each folder and collects all the fan and blotch data into one summary file each, followed by merge operations with meta-data that is useful for future analysis. These files are described in the next section, Appendix E.

Appendix F. Extended validation results
In addition to the combined fan and blotch count we explored in Section 5, we further explore here how well the Planet Four catalog identifies fans (those dark sources with a clear direction and starting point) versus blotches, separately. We separate the catalog and gold standard classifications by marker type in Figures F.41 to F.44. The data processing pipeline plays a significant role in the completeness of the catalog. At the thresholding stage, our data processing algorithm determines which clusters will ultimately become fans, namely those with a value of P(fan) > 0.5. As for the total number of sources, the number distribution of fans and the number distribution of blotches match the expert assessments and are within the 3-σ uncertainty [Kraft et al., 1991]. Thus, in most cases where a science team member marked a fan, the catalog also identifies this source as a fan. Based on these results, we have high confidence in the fan and blotch identifications within the Planet Four catalog.
Figure caption (partial): Bin size is 5; each bin is directly compared between the data from all experts, GP (blue), MES (orange), KMA (grey), and the catalog results (brown). Binning max was cut off at 85, omitting single-entry bins above.
Appendix F.1. Example tile comparisons
In Figures F.45 and F.46 we show an example comparison of volunteers' markings with those performed by the science team. The aforementioned slight deviations of the science team members from each other are visible; however, it is clear that the catalog wind directions in Fig. F.45 are well reproduced by both the specialists and the volunteers. The results for blotches in Fig. F.46 are very comparable, with the added simplification that blotches have a much reduced directivity compared to fans.