{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jmGrD1tC1nEr"
      },
      "source": [
        "---\n",
        "# Python resources for the Virtual Observatory\n",
        "## Some use cases of astroquery, Vizier and PyVO\n",
        "---\n",
        "\n",
        "**Authored by** Ricardo Rizzo (ISDEFE, SVO collaborator)\n",
        "\n",
        "**Revised/approved:** [Spanish Virtual Observatory](https://svo.cab.inta-csic.es)\n",
        "\n",
        "**Creation date:** October 25, 2025\n",
        "\n",
        "**Updated:** November, 2025\n",
        "\n",
        "This tutorial presents an introductory use of Python tools aimed to profit the Virtual Observatory (VO) capabilities.\n",
        "\n",
        "The tutorial assumes that the user:\n",
        "- has sufficient knowledge of Python 3.\n",
        "- is able to manage python notebooks and Google colab platform.\n",
        "- has an overall concept about the main VO characteristics, particularly about archives, applications and query tools.\n",
        "- does **not** need previous knowledge about existing catalogs or other VO resources.\n",
        "---\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "R2-tUtjqRFIf"
      },
      "source": [
        "# Overview\n",
        "\n",
        "The tutorial focuses on the use of the packages ```Vizier```, ```astroquery```, and ```PyVO```. These packages are considered a bridge between the VO resources and the powerful and versatile Python language.\n",
        "\n",
        "It follows a \"_learning by doing_\" approach and is structured around a series of examples which make use of the necessary packages.\n",
        "\n",
        "Therefore, it is **NOT** a complete reference to all the capabilities of the packages, but a starting guide to use them. Some external references are included when appropriate. In some cases, it is expected not so happy results, aimed to show the \"real life\" when an astronomer works with not fully standard databases."
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Why using VizieR and vizier\n",
        "- VizieR (https://vizier.cds.unistra.fr/) provides the most complete library of published astronomical catalogues (25800+ up to November 2025).\n",
        "- It is a \"live\" repository, with verified and enriched data from surveys, catalogs and tables.\n",
        "- Its content is accessible by multiple online interfaces and desktop applications.\n",
        "- There are a number of query tools that allow the user to select relevant data tables and to extract records matching a given criteria.\n",
        "- It has several mirrors worldwide.\n",
        "- Most of its capabilities are integrated in ```vizier```, an affiliated package of ```astroquery```.\n",
        "- VizieR (and vizier) usually is the starting point of a new research.\n",
        "- Ideal when have little information about a topic or a source.\n",
        "- Reference: [https://astroquery.readthedocs.io/en/latest/vizier/vizier.html](https://astroquery.readthedocs.io/en/latest/vizier/vizier.html)"
      ],
      "metadata": {
        "id": "J-H6pgQs6RW8"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## What about astroquery\n",
        "- astroquery is **THE** query tool in Python for astronomy.\n",
        "- It has an excellent management of tables in all their formats, particularly VOTables.\n",
        "- astroquery includes easy access to VizieR functionalities through its convenience functions, which are added to the astroquery.vizier methods and functions.\n",
        "- Reference: [https://astroquery.readthedocs.io/en/latest/](https://astroquery.readthedocs.io/en/latest/)"
      ],
      "metadata": {
        "id": "5AiZyrmC6TmL"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## When to go to PyVO\n",
        "- PyVO works as an \"extension\" of astropy, the most popular collection of software packages written in Python for use in astronomy.\n",
        "- PyVO has an API for most VO protocols.\n",
        "- Its use is recommended when a VO service is not registered or not fully VO-compliant.\n",
        "- When used in combination with numpy, pandas, matplotlib, etc., other high functionalities are possible.\n",
        "- Reference: [https://github.com/astropy/pyvo](https://github.com/astropy/pyvo)"
      ],
      "metadata": {
        "id": "kNv4pSJB6Qgj"
      }
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ehthEjb_NlrM"
      },
      "source": [
        "---\n",
        "# Install the necessary packages\n",
        "\n",
        "The packages needed for this tutorial are indicated in the file ```requirements.txt```.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ry2amL9TQBmu"
      },
      "outputs": [],
      "source": [
        "# Download the list of needed packages from the Zenodo tutorial\n",
        "! wget -qO requirements.txt https://zenodo.org/api/records/17533386/files/requirements.txt\n",
        "# Load requirements.txt from owner's drive\n",
        "! pip install -r requirements.txt"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "The following packages are necessary:\n",
        "```\n",
        "astropy\n",
        "astroquery\n",
        "pyvo\n",
        "numpy\n",
        "pandas\n",
        "matplotlib\n",
        "rich\n",
        "requests\n",
        "```\n",
        "Very probably, the colab session will only install ```pyvo``` and ```astroquery```. The other packages are included here in order to guarantee a correct run in case of exporting the notebook to another environment."
      ],
      "metadata": {
        "id": "WKGNkDExmhlM"
      }
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cw_nDUcoVCjI"
      },
      "source": [
        "# Using VizieR in Python: a cone search\n",
        "\n",
        "Below we develop two examples which perform a Cone Search query to VizieR using astroquery."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mlKXXEtsZDki"
      },
      "source": [
        "## Packages, definitions, setups\n",
        "Before doing the queries, we import the necessary packages, define the target source and create the query instance."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "hXxY6cyR1heC"
      },
      "outputs": [],
      "source": [
        "# Import necessary modules\n",
        "from astroquery.vizier import Vizier\n",
        "from astropy.coordinates import SkyCoord\n",
        "import astropy.units as u\n",
        "from astropy.table import Table\n",
        "from rich import print"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "o5vX0A20Wg4L"
      },
      "outputs": [],
      "source": [
        "# Define the target (play and change to your favorite source)\n",
        "TARGET_NAME = \"NGC2359\"  # a Wolf Rayet Nebula"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "spotSOfoWjcQ"
      },
      "outputs": [],
      "source": [
        "# Clear the local cache\n",
        "Vizier.clear_cache()\n",
        "\n",
        "# Create a Vizier instance\n",
        "vizier = Vizier(columns=['*'])  # Retrieve all columns"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "4Clud2pgWjuu"
      },
      "outputs": [],
      "source": [
        "# Optionally, it is possible to select a mirror\n",
        "us_vizier = Vizier(vizier_server=\"vizier.cfa.harvard.edu/viz-bin/VizieR-2\")\n",
        "# [[See help(Vizier) for other parameters]]"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "> **⚠️ Warning**\n",
        "\n",
        "> As it is specified, the instance ```vizier``` will do the query in all the catalogs (more than 20k!). This may take several seconds, or even minutes, to finish. The user may want to specify the catalogs to be searched by specifying them explicitly or by taking advantage of the [CDS categories](https://vizier.cds.unistra.fr/vizier/doc/catstd-2.htx?utm_source=chatgpt.com) (for example, adding ```, catalog=\"II/*\"``` to the instance in case of looking for photometry surveys).\n"
      ],
      "metadata": {
        "id": "zDNY09TAMnkF"
      }
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6xkUipwtW_Hj"
      },
      "source": [
        "## Example 1: query the available catalogs around the target\n",
        "| Key function/method  | Description |\n",
        "|---------------|-------------|\n",
        "| ```vizier.query_object``` | Queries the service by internally resolving coordinates of the input source name.\n",
        "||Returns a table object.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "GoGffz29XRFW"
      },
      "outputs": [],
      "source": [
        "# Do the query\n",
        "catalogs = vizier.query_object(TARGET_NAME)\n",
        "print(f\"Number of catalogs found around {TARGET_NAME}: {len(catalogs)}\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ezm6KP7QXRIL"
      },
      "outputs": [],
      "source": [
        "# Print the catalog codes and number of entries\n",
        "for catalog in catalogs:\n",
        "    print(f\"Catalog: {catalog.meta['name']}, Number of entries: {len(catalog)}\")\n",
        "# [[Note the difference w.r.t. pyvo cases]]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Y2DA9EhecwJA"
      },
      "source": [
        "## Example 2: perform a Cone Search around the target coordinates\n",
        "| Key function/method  | Description |\n",
        "|---------------|-------------|\n",
        "| ```vizier.query_region``` | Queries the service in a region centered at some coordinates and with a given radius.\n",
        "||Returns a table object."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ECh-F2NUdOgD"
      },
      "outputs": [],
      "source": [
        "# Do the query\n",
        "target_coords = SkyCoord.from_name(TARGET_NAME)\n",
        "SEARCH_RADIUS = 5 * u.arcsec\n",
        "catalogs = vizier.query_region(target_coords, radius=SEARCH_RADIUS)\n",
        "if catalogs != None:\n",
        "    print(f\"Number of catalogs found in Cone Search around {TARGET_NAME}: {len(catalogs)}\")\n",
        "else:\n",
        "    print(f\"No catalogs found in Cone Search around {TARGET_NAME}.\")"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Print the catalog codes and number of entries\n",
        "if catalogs != None:\n",
        "    for catalog in catalogs:\n",
        "        print(f\"Catalog: {catalog.meta['name']}, Number of entries: {len(catalog)}\")\n"
      ],
      "metadata": {
        "id": "LLivWv1sQd2K"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# More VizieR: Query keywords, table downloading and writing\n",
        "\n",
        "In this example we do a query by specifying descriptors which will match the keywords provided for the catalogs.\n",
        "\n",
        "Additionally, we select, download and inspect one of the catalogs. Finally the catalog is written as a VOTable.\n",
        "\n"
      ],
      "metadata": {
        "id": "4-m8qVuaSfLy"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Example 3: work with catalogs containing data related to pulsar rotation\n",
        "\n",
        "| Key function/method  | Description |\n",
        "|----------------------|-------------|\n",
        "| ```vizier.find_catalogs``` | Queries all catalogs matching some keywords.\n",
        "| ```vizier.get_catalogs```  | Retrieves a list of catalogs.\n",
        "| ```table.write```          | Writes a table in several formats.\n"
      ],
      "metadata": {
        "id": "9z7Z37isUX-c"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "Find all the catalogues matching the keywords."
      ],
      "metadata": {
        "id": "pYb8jIwSgubn"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "qYPm7DcoVBpa"
      },
      "outputs": [],
      "source": [
        "# Clear the local cache\n",
        "Vizier.clear_cache()\n",
        "\n",
        "# Create a Vizier instance\n",
        "vizier = Vizier(columns=['*'])  # Retrieve all columns\n",
        "\n",
        "# Search for catalogs related to pulsar rotation\n",
        "SEARCH_TERM = 'pulsars rotation' # Play and change it by your favourite topic!\n",
        "pulsar_list = vizier.find_catalogs(SEARCH_TERM)\n",
        "\n",
        "# Number of catalogs found\n",
        "print(\"-------------------------------------------------------------\")\n",
        "print(f\"Searching for catalogs with term: '{SEARCH_TERM}'\")\n",
        "print(f\"Number of catalogs found: {len(pulsar_list)}\")"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Print catalog codes and their descriptions\n",
        "print(\"-------------------------------------------------------------\")\n",
        "for k,v in pulsar_list.items():\n",
        "   print(k, \":\", v.description)"
      ],
      "metadata": {
        "id": "1lj3sWeOWbvf"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Select one of the catalogs and inspect their tables."
      ],
      "metadata": {
        "id": "UFwXeRHpebRp"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Get a specific catalog by its code\n",
        "CATALOG_CODE = 'J/A+A/684/A208'\n",
        "\n",
        "# Limit the number of rows to retrieve (optional)\n",
        "vizier.ROW_LIMIT = 1 # Set to the minimum for faster retrieval\n",
        "\n",
        "# Retrieve the catalog\n",
        "catalog = vizier.get_catalogs(CATALOG_CODE) # Retrieving may take some time\n",
        "print(\"-------------------------------------------------------------\")\n",
        "print(f\"Retrieving catalog: {CATALOG_CODE}\")\n",
        "print(f\"Number of tables in catalog: {len(catalog)}\")\n",
        "print(catalog)"
      ],
      "metadata": {
        "id": "oZHIdWZteVjC"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Work with the first table of the catalog."
      ],
      "metadata": {
        "id": "YGE9QoSek-OV"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Access to the first table (still with just one row)\n",
        "table = catalog[0]\n",
        "print(\"-------------------------------------------------------------\")\n",
        "print(f\"First table in catalog '{CATALOG_CODE}':\")\n",
        "print(table)"
      ],
      "metadata": {
        "id": "FqPNVnLZlFro"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Access to the columns\n",
        "print(\"-------------------------------------------------------------\")\n",
        "print(\"Columns in the table:\")\n",
        "print(table.colnames)\n",
        "\n",
        "# Select a specific column and print its description\n",
        "print(\"-------------------------------------------------------------\")\n",
        "print(f\"Description of column 'FeRASS':\\n{table['FeRASS'].description}\")"
      ],
      "metadata": {
        "id": "p8T2dtgBlO9H"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Access to the data in the table (download and write as VOTable)."
      ],
      "metadata": {
        "id": "pe7IlWYzlnFM"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Access to data (all rows)\n",
        "vizier.ROW_LIMIT = -1 # All available rows\n",
        "#OPTION vizier.TIMEOUT = 500  # Increase timeout to 500 sec\n",
        "catalog = vizier.get_catalogs(CATALOG_CODE)\n",
        "table = catalog[0]\n",
        "print(\"-------------------------------------------------------------\")\n",
        "print(f\"Number of rows: {len(table)}\")\n",
        "print(f\"Data in selected table, from catalog {CATALOG_CODE}:\")\n",
        "print(table)\n",
        "\n",
        "# Write table to a local VOtable\n",
        "outfile = f\"{SEARCH_TERM.replace(' ','_')}.vot\"\n",
        "table.write(outfile, format='votable', overwrite=True)\n",
        "print(\"-------------------------------------------------------------\")\n",
        "print(f\"Catalog saved to local file/s {outfile}\")"
      ],
      "metadata": {
        "id": "fi_qwWNilfzV"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Life beyond VizieR: a Cone Search using PyVO\n",
        "\n",
        "In the following example we mine the Gaia DR3 catalog around the nearby Pleiades star cluster.\n",
        "\n",
        "Through basic table manipulation, we construct a color-magnitude diagram, plot the proper motion vectors and select candidate Pleiades members based on them.\n",
        "\n",
        "> **📝 Note:**\n",
        "\n",
        "> _This science case will be fully developed in this School using Topcat._"
      ],
      "metadata": {
        "id": "LDpW7lVsaWrb"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Packages, service, coordinates\n",
        "\n",
        "Before doing the query, we import the necessary packages, define the service and get the target coordinates."
      ],
      "metadata": {
        "id": "OmwbFwODbuvR"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Import packages\n",
        "import pyvo\n",
        "import pandas as pd\n",
        "import matplotlib.pyplot as plt\n",
        "import numpy as np\n",
        "from astropy.table import Table\n",
        "from astropy import units as u\n",
        "from astropy.coordinates import SkyCoord, get_icrs_coordinates"
      ],
      "metadata": {
        "id": "VPk2p4VLcOop"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Definition of service (protocol and URL)\n",
        "ACCESS_URL = \"https://gea.esac.esa.int/tap-server/conesearch?TABLE=gaiadr3.gaia_source&IDCOL=source_id&RACOL=ra&DECCOL=dec&\"\n",
        "service = pyvo.dal.SCSService(ACCESS_URL)"
      ],
      "metadata": {
        "id": "DlHUz5VZd3Yi"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Find coordinates of a source from its name\n",
        "pleiades_coord = get_icrs_coordinates('M45')\n",
        "\n",
        "# This convenience function may also be used\n",
        "pleiades_coord = SkyCoord.from_name('M45')\n",
        "print(f\"Pleiades coordinates (ICRS): RA={pleiades_coord.ra.degree}, Dec={pleiades_coord.dec.degree}\")"
      ],
      "metadata": {
        "id": "-UAtekQqeZ45"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Example 4: query the Gaia DR3 main catalog and manipulate results\n",
        "\n",
        "| Key function/method  | Description |\n",
        "|----------------------|-------------|\n",
        "| ```pyvo.dal.SCSService``` | Definition of a Cone Search service. |\n",
        "| ```service.search``` | Do a Cone Search query around a target position. |\n",
        "| ```pyvo.conesearch``` | Convenience function to avoid defining the service. |"
      ],
      "metadata": {
        "id": "Ia-CIjeonnGW"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "Do the query and print the number of stars found."
      ],
      "metadata": {
        "id": "uV6pUlcwpbZ8"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Do the query\n",
        "target_ra = pleiades_coord.ra.degree\n",
        "target_dec = pleiades_coord.dec.degree\n",
        "radius = 1.    # also in [deg]\n",
        "results = service.search((target_ra, target_dec), size=(radius, radius))\n",
        "\n",
        "# This convenience function do the search in a circle:\n",
        "#results = pyvo.conesearch(ACCESS_URL, pos=[target_ra, target_dec], radius=radius)\n",
        "\n",
        "# Results works similarly to an astropy Table\n",
        "print(f\"Number of results: {len(results)}\")"
      ],
      "metadata": {
        "id": "I0o9OHiJphwV"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Table manipulation and conversion."
      ],
      "metadata": {
        "id": "zHp481p7sGEo"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Access to fields/columns and data\n",
        "results_table = results.to_table()\n",
        "print(results_table.colnames)  # List of column names\n",
        "print(results_table['source_id'])  # Access to a specific column\n",
        "\n",
        "# Convert to VOTable\n",
        "results_vot = results.votable\n",
        "\n",
        "# Print all the fields (i.e. columns)\n",
        "for field in results_vot.iter_fields_and_params():\n",
        "    print(f\"Field/Param: {field.name}\")"
      ],
      "metadata": {
        "id": "IEt7dqJRsKrK"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Save results as a VOTable."
      ],
      "metadata": {
        "id": "dh3OgnsBsXUP"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Save results to a local VOTable file\n",
        "results_vot.to_xml('pleiades_gaia_dr3.vot')"
      ],
      "metadata": {
        "id": "zEZhTd0QskDX"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Create a color-magnitude diagram."
      ],
      "metadata": {
        "id": "c5YtbC6-tI2v"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Convert to pandas DataFrame\n",
        "df = results.to_table().to_pandas()\n",
        "print(df.head())\n",
        "\n",
        "# Plot 1: G magnitude vs. BP-RP color\n",
        "plt.figure(figsize=(8,6))\n",
        "plt.scatter(df['bp_rp'], df['phot_g_mean_mag'], s=1, color='blue')\n",
        "plt.gca().invert_yaxis()\n",
        "plt.xlabel('BP - RP [mag]')\n",
        "plt.ylabel('G mag [mag]')\n",
        "plt.title('Color-Magnitude Diagram of Pleiades (Gaia DR3). N={}'.format(len(df)))\n",
        "plt.savefig('pleiades_cmd.pdf', dpi=300)\n",
        "plt.show()"
      ],
      "metadata": {
        "id": "CGESGwhKtMBj"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Plot proper motion vectors in the sky."
      ],
      "metadata": {
        "id": "Gbv6-Hc-tlBO"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Plot 2: Proper motion vector field (using quiver with scaling)\n",
        "plt.figure(figsize=(8,8))\n",
        "scale_factor = 3e-4  # Adjust this factor to scale the arrows\n",
        "plt.quiver(df['ra'], df['dec'], df['pmra']*scale_factor, df['pmdec']*scale_factor, angles='xy', scale_units='xy', scale=1, color='red', width=0.002)\n",
        "plt.xlabel('RA [deg]')\n",
        "plt.ylabel('Dec [deg]')\n",
        "plt.title('Proper Motion Vector Field of Pleiades (Gaia DR3). N={}'.format(len(df)))\n",
        "plt.axis('equal')\n",
        "plt.show()"
      ],
      "metadata": {
        "id": "nVlS0v9dt-lA"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [],
      "metadata": {
        "id": "v6-I5HpMu6EP"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Diagram of proper motion values in RA, Dec.\n",
        "\n",
        "_Note the group of points around (20, -45) mas/yr._"
      ],
      "metadata": {
        "id": "mrvY8CxyuS4L"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Plot 3: Proper motions in RA vs. Dec\n",
        "plt.figure(figsize=(8,6))\n",
        "plt.scatter(df['pmra'], df['pmdec'], s=1, color='blue')\n",
        "plt.xlabel('pmra [mas/yr]')\n",
        "plt.ylabel('pmdec [mas/yr]')\n",
        "plt.xlim(-50, 70)\n",
        "plt.ylim(-80, 40)\n",
        "plt.title('Proper Motions of Pleiades (Gaia DR3). N={}'.format(len(df)))\n",
        "plt.grid()\n",
        "ax = plt.gca()\n",
        "ax.set_aspect('equal', adjustable='box')\n",
        "plt.savefig('pleiades_pm.pdf', dpi=300)\n",
        "plt.show()"
      ],
      "metadata": {
        "id": "c7OGw7FEuf5T"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Create a group of comoving sources around (20, -45) mas/yr."
      ],
      "metadata": {
        "id": "fozUVkc1vCiT"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Select comoving sources\n",
        "pm_mask = (\n",
        "    (df['pmra'] > 15) &\n",
        "    (df['pmra'] < 25) &\n",
        "    (df['pmdec'] > -50) &\n",
        "    (df['pmdec'] < -40)\n",
        ")\n",
        "\n",
        "comoving = df[pm_mask]\n",
        "print(f\"Number of comoving sources: {len(comoving)}\")\n",
        "print(comoving[['source_id', 'ra', 'dec', 'pmra', 'pmdec', 'phot_g_mean_mag']])"
      ],
      "metadata": {
        "id": "rPf7WoztvQYT"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Repeat the color-magnitude diagram for the whole sample and the comoving stars."
      ],
      "metadata": {
        "id": "hq-sdoszvitA"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Repeat plot 1 (a CMD) for df and comoving sources\n",
        "plt.figure(figsize=(8,6))\n",
        "plt.scatter(df['bp_rp'], df['phot_g_mean_mag'], s=1, color='blue', label=\"All sources\")\n",
        "plt.scatter(comoving['bp_rp'], comoving['phot_g_mean_mag'], s=1, color='red', label=\"Comoving sources\")\n",
        "plt.gca().invert_yaxis()\n",
        "plt.xlabel('BP - RP [mag]')\n",
        "plt.ylabel('G mag [mag]')\n",
        "plt.title('Color-Magnitude Diagram of Comoving Sources in Pleiades (Gaia DR3). N={}'.format(len(comoving)))\n",
        "plt.savefig('pleiades_comoving_cmd.pdf', dpi=300)\n",
        "plt.show()"
      ],
      "metadata": {
        "id": "hLotI7eVvuVL"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# More PyVO: access to images\n",
        "\n",
        "In this example we use pyVO to perform a Simple Image Access (SIA) query.\n"
      ],
      "metadata": {
        "id": "oz7AiPhGmyov"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Example 5: search and download an image\n",
        "\n",
        "| Key function/method  | Description |\n",
        "|----------------------|-------------|\n",
        "| ```pyvo.regsearch``` | Do a query consulting the VO Registry. |\n",
        "| ```pyvo.dat.SIAService``` | Definition of a Simple Image Access query |"
      ],
      "metadata": {
        "id": "JKf2FqVsdZQD"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "Import the necessary packages"
      ],
      "metadata": {
        "id": "1j-RRHWSIyLS"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Import packages\n",
        "import pyvo\n",
        "from pyvo.dal.sia import SIAResults\n",
        "import requests\n",
        "import pandas as pd\n",
        "import matplotlib.pyplot as plt\n",
        "import numpy as np\n",
        "from astropy.table import Table\n",
        "from astropy import units as u\n",
        "from astropy.coordinates import SkyCoord, get_icrs_coordinates\n",
        "from astropy.io import fits\n",
        "from astropy.wcs import WCS\n",
        "from astropy.io.votable import from_table\n",
        "import subprocess"
      ],
      "metadata": {
        "id": "wu5E8C-lnAU6"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Define a function to plot an image from a 2D FITS file."
      ],
      "metadata": {
        "id": "zFnTU-XYlYm9"
      }
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "590ee9e7"
      },
      "source": [
        "# Function to plot a 2D FITS image\n",
        "\n",
        "def plot_FITS(filename):\n",
        "    from pathlib import Path\n",
        "    from astropy.io import fits\n",
        "    from astropy.wcs import WCS\n",
        "    import matplotlib.pyplot as plt\n",
        "    from astropy.visualization import ZScaleInterval, ImageNormalize\n",
        "\n",
        "    # Open FITS file\n",
        "    hdu = fits.open(filename)[0]\n",
        "    data = hdu.data\n",
        "    header = hdu.header\n",
        "\n",
        "    # Get WCS information\n",
        "    wcs = WCS(header)\n",
        "\n",
        "    # Compute zscale normalization (automatic intensity range)\n",
        "    interval = ZScaleInterval()\n",
        "    vmin, vmax = interval.get_limits(data)\n",
        "    norm = ImageNormalize(vmin=vmin, vmax=vmax)\n",
        "\n",
        "    # Plot the FITS image with WCS projection\n",
        "    fig = plt.figure()\n",
        "    ax = fig.add_subplot(111, projection=wcs)\n",
        "    im = ax.imshow(data, origin='lower', cmap='viridis', norm=norm)\n",
        "\n",
        "    # Add coordinate grid and labels\n",
        "    #ax.coords.grid(color='cyan', ls='--', lw=0.7, alpha=0.7)\n",
        "    ax.set_xlabel('Right Ascension')\n",
        "    ax.set_ylabel('Declination')\n",
        "\n",
        "    # Custom formatter for the bottom-right status bar\n",
        "    def format_coord(x, y):\n",
        "        try:\n",
        "            ra, dec = wcs.wcs_pix2world([[x, y]], 0)[0]\n",
        "            return f\"RA={ra:.6f}°  Dec={dec:.6f}°   Pixel=({x:.1f},{y:.1f})\"\n",
        "        except Exception:\n",
        "            return f\"Pixel=({x:.1f},{y:.1f})\"\n",
        "\n",
        "    ax.format_coord = format_coord\n",
        "\n",
        "    plt.colorbar(im, ax=ax, orientation='vertical', label='Intensity')\n",
        "    plt.title(Path(filename).name)\n",
        "    plt.savefig(filename.replace('fits','pdf'))\n",
        "    print(f\"Image saved to {filename.replace('fits','pdf')}\")\n",
        "    plt.show()"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Look for available services in the VO Registry."
      ],
      "metadata": {
        "id": "Cgh5gc9OnD1B"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# It is possible to look for all the available (i.e. registered) SIA services\n",
        "services = pyvo.regsearch(servicetype='sia', keywords=['image'])\n",
        "print(f\"Number of SIA services found: {len(services)}\")"
      ],
      "metadata": {
        "id": "zjkLpMn5nl9c"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Too many results. Let's refine the search to the collections hosted by NOIRLab Data Lab\n",
        "services = pyvo.regsearch(servicetype='sia', keywords=['noirlab'])\n",
        "print(f\"Number of SIA services found for NOIRLab: {len(services)}\")"
      ],
      "metadata": {
        "id": "fMoDFlm5nq39"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "for pos, row in enumerate(services.to_table()):\n",
        "    print(f\"\\nService {pos}: {row['res_title']}\")\n",
        "    if 'access_urls' in row.colnames:\n",
        "        print(f\"URL: {row['access_urls']}\")\n",
        "    else:\n",
        "        print(\"No access_url found.\")"
      ],
      "metadata": {
        "id": "tarCGU99rZKm"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "> **⚠️ Warning**\n",
        "\n",
        "> The field found as ```access_urls``` in colab may appear as ```access_url``` if you run the code in a terminal.\n",
        "\n",
        "> _Take it into account in your daily work._"
      ],
      "metadata": {
        "id": "tmmomILhuIRe"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "Select the \"coadd images\" service (10th in the list)"
      ],
      "metadata": {
        "id": "LmgxFpO6vvp1"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Select coadded images from NOIRLab Data Lab\n",
        "coadd_service = services.to_table()[9]\n",
        "print(f\"Selected service: {coadd_service['res_title']}\")\n",
        "SERVICE_URL = coadd_service['access_urls']\n",
        "service = pyvo.dal.SIAService(SERVICE_URL)\n",
        "print(f\"URL: {SERVICE_URL}\")"
      ],
      "metadata": {
        "id": "uv-oF_zlv5y8"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Find the coordinates and do the query."
      ],
      "metadata": {
        "id": "nYzv0hGGwIoG"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Find coordinates of a source from its name\n",
        "TARGET = 'Horsehead Nebula'\n",
        "target_coords = get_icrs_coordinates(TARGET)\n",
        "# Convenience:\n",
        "target_coords = SkyCoord.from_name(TARGET)\n",
        "target_ra = target_coords.ra.degree\n",
        "target_dec = target_coords.dec.degree\n",
        "radius = 0.25*u.deg    # radius may be specified using astropy units\n",
        "print(\"\\n-------------------------------------------------------------\")\n",
        "print(f\"Target coordinates: (RA, Dec)=({target_coords.ra.degree:5f}, Dec={target_coords.dec.degree:5f}). Search radius: {radius}\")"
      ],
      "metadata": {
        "id": "lvygxS7X5FKI"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Do the query\n",
        "images = service.search(target_coords, size=(radius,radius))\n",
        "\n",
        "# results works similarly to an astropy Table\n",
        "print(f\"Number of images: {len(images)}\")"
      ],
      "metadata": {
        "id": "gu0rAYJ95KiP"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Inspect the results."
      ],
      "metadata": {
        "id": "iIarSrNN54UK"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Inspect the column names\n",
        "images_table = images.to_table()\n",
        "print(images_table.colnames)  # List of column names"
      ],
      "metadata": {
        "id": "ZGSyP_gf5-8y"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Note that not all the 'prodtype' are 'image'.\n",
        "prodtypes = set(images_table['prodtype'])\n",
        "print(f\"Product types in the results: {prodtypes}\")"
      ],
      "metadata": {
        "id": "Ny_3muNM6a9y"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Convert to VOTable and save results."
      ],
      "metadata": {
        "id": "3u2hRG_p6uxq"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Convert to VOTable\n",
        "images_vot = images.votable\n",
        "\n",
        "# Print all the fields (i.e. columns)\n",
        "for field in images_vot.iter_fields_and_params():\n",
        "    print(f\"Field/Param: {field.name}\")\n",
        "\n",
        "# Save query results to a local VOTable file\n",
        "images_table.write('images.vot', format='votable', overwrite=True)\n",
        "print(\"\\n-------------------------------------------------------------\")\n",
        "print(\"Results saved to images.vot\")"
      ],
      "metadata": {
        "id": "wmGhcsTO63iA"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Download and plot one of the images."
      ],
      "metadata": {
        "id": "44w6hZEG7asy"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Access and download one image\n",
        "horsehead = images[21]\n",
        "image_url = horsehead.getdataurl()\n",
        "print(f\"Selected image data URL: {image_url}\")\n",
        "response = requests.get(image_url)\n",
        "with open('horsehead.fits', 'wb') as f:\n",
        "    f.write(response.content)\n",
        "print(\"\\n-------------------------------------------------------------\")\n",
        "print(\"Selected image downloaded as horsehead.fits\")"
      ],
      "metadata": {
        "id": "7_GVpjQ6zhis"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "plot_FITS('horsehead.fits')"
      ],
      "metadata": {
        "id": "0HVOpE4p128_"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Diving deep inside VO: the Registry\n",
        "Most of the Virtual Observatory resources are available through the VO Registry: a compilation of metadata and applications to access them.\n",
        "\n",
        "In this example we do a query of spectroscopic data by a direct request to the Registry."
      ],
      "metadata": {
        "id": "GUQFt9Qo6hjc"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Example 6: explore PyVO capabilites searching the Registry for spectra.\n",
        "\n",
        "| Key function/method  | Description |\n",
        "|----------------------|-------------|\n",
        "| ```pyvo.regsearch``` | Convenience function to do a query to the VO Registry. |\n",
        "| ```pyvo.dat.SSAService``` | Definition of a Simple Spectral Access query |"
      ],
      "metadata": {
        "id": "gw50C_SHEukO"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "Import the necessary packages"
      ],
      "metadata": {
        "id": "R6PrI3H9LdOb"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "import pyvo\n",
        "from pyvo import registry\n",
        "from rich import print\n",
        "import requests\n",
        "import matplotlib.pyplot as plt\n",
        "from astropy.table import Table\n",
        "from astropy.coordinates import SkyCoord"
      ],
      "metadata": {
        "id": "HvLHHIULLgTj"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Start by searching all the catalogs available through the Simple Spectral Access (SSA) protocol."
      ],
      "metadata": {
        "id": "pLXIkvGBKcLe"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Search for Simple Spectral Access (SSA) services using pyVO regsearch convenience function\n",
        "services = pyvo.regsearch(servicetype='ssa')\n",
        "print(f\"Number of Simple Spectral Access services found: {len(services)}\")"
      ],
      "metadata": {
        "id": "QhEN08avKaCv"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Too many results.\n",
        "\n",
        "Let's refine the search to services related to LAMOST"
      ],
      "metadata": {
        "id": "ZLQlp32NKyxV"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Search only to LAMOST using PyVO registry method\n",
        "ssa_lamost_services = registry.search(servicetype=\"ssa\", keywords=\"LAMOST\")\n",
        "type(ssa_lamost_services)\n",
        "\n",
        "# Print number of services found\n",
        "print(f\"Number of SSA services related to LAMOST: {len(ssa_lamost_services)}\")"
      ],
      "metadata": {
        "id": "RP5IZlyAK5RJ"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Look at some details about the services found."
      ],
      "metadata": {
        "id": "DMKNh617M8d8"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Title of the services and URLs to access them\n",
        "pos = 0\n",
        "for s in ssa_lamost_services:\n",
        "    print(f\"Service {pos}: {s['res_title']}\")\n",
        "    try:\n",
        "        print(f\"Access URLs: {s['access_urls']}\")\n",
        "    except KeyError:\n",
        "        print(\"No access_urls found.\")\n",
        "    pos += 1\n"
      ],
      "metadata": {
        "id": "t77udu1CNHe6"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Select the last service (LAMOST DR6)"
      ],
      "metadata": {
        "id": "bXWhEmf2NxUj"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Select the last service (LAMOST DR6)\n",
        "lamost_dr6 = ssa_lamost_services[-1]\n",
        "print(f\"Selected LAMOST DR6 MRS service title: {lamost_dr6['res_title']}\")\n",
        "\n",
        "for col,val in lamost_dr6.items():\n",
        "    print(f\"{col}: {val}\")"
      ],
      "metadata": {
        "id": "Tbobn4p6N48Q"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Build the service and do the query"
      ],
      "metadata": {
        "id": "meUIGd5pOCTT"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Get the URL from the previous request\n",
        "# query_url = lamost_dr6['access_urls']\n",
        "query_url = lamost_dr6['access_urls'][0] # Extract the first URL from the list\n",
        "print(f\"Query URL: {query_url}\")\n",
        "\n",
        "# Build the service object\n",
        "lamost_service = pyvo.dal.SSAService(query_url)\n",
        "print(f\"SSAService object created: {lamost_service}\")\n",
        "\n",
        "# Performing the query\n",
        "spectra = lamost_service.search(pos=(180.0, 0.0), diameter=3.)\n",
        "\n",
        "# Alternatively, try with a source name\n",
        "#pos = SkyCoord.from_name('VY CMa')\n",
        "#spectra = lamost_service.search(pos=pos, diameter=3.)\n",
        "print(f\"Number of results from LAMOST DR6 MRS query: {len(spectra)}\")"
      ],
      "metadata": {
        "id": "17wRgaZLO4e2"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Manipulate the results and save the table"
      ],
      "metadata": {
        "id": "3UBsl3oLVmCC"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Convert results to an Astropy Table\n",
        "spectra_table = spectra.to_table()\n",
        "print(spectra_table)\n",
        "\n",
        "# Convert to VOTable and write to file\n",
        "spectra_vot = spectra.votable\n",
        "spectra_vot.to_xml('lamost_dr6_spectra.vot')\n",
        "print(\"\\n-------------------------------------------------------------\")\n",
        "print(\"Results saved to lamost_dr6_spectra.vot\")"
      ],
      "metadata": {
        "id": "u2wxMa0pV-b1"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Select and save one spectrum from the results."
      ],
      "metadata": {
        "id": "QTFMX9pFWWKY"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Select one spectrum and download data\n",
        "k = 690  # Index of the spectrum to download\n",
        "spectrum = spectra[k]\n",
        "spectrum_url = spectrum.getdataurl()\n",
        "print(f\"Spectrum data URL: {spectrum_url}\")\n",
        "response = requests.get(spectrum_url)\n",
        "\n",
        "# response is natively a VOTable file\n",
        "with open('lamost_spectrum.vot', 'wb') as f:\n",
        "    f.write(response.content)\n",
        "print(\"\\n-------------------------------------------------------------\")\n",
        "print(\"Spectrum saved to lamost_spectrum.vot\")"
      ],
      "metadata": {
        "id": "FedjApgvWZez"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Load the spectrum, plot it and save the figure."
      ],
      "metadata": {
        "id": "s8zZ7KdrWoGG"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "\n",
        "# Load the spectrum and plot it\n",
        "spectrum_table = Table.read('lamost_spectrum.vot')\n",
        "plt.plot(spectrum_table['spectral'], spectrum_table['flux'])\n",
        "plt.xlabel('Wavelength (Angstrom)')\n",
        "plt.ylabel('Flux (counts)')\n",
        "plt.title('LAMOST DR6 MRS Spectrum')\n",
        "plt.savefig('lamost_spectrum.pdf', dpi=300)\n",
        "print(\"\\n-------------------------------------------------------------\")\n",
        "print(\"Spectrum saved to lamost_spectrum.pdf\")\n",
        "plt.show()"
      ],
      "metadata": {
        "id": "xlP09LRwWr_D"
      },
      "execution_count": null,
      "outputs": []
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [
        "R2-tUtjqRFIf",
        "ehthEjb_NlrM",
        "cw_nDUcoVCjI",
        "mlKXXEtsZDki",
        "6xkUipwtW_Hj",
        "Y2DA9EhecwJA",
        "4-m8qVuaSfLy",
        "9z7Z37isUX-c",
        "LDpW7lVsaWrb",
        "OmwbFwODbuvR",
        "Ia-CIjeonnGW",
        "oz7AiPhGmyov",
        "JKf2FqVsdZQD",
        "GUQFt9Qo6hjc",
        "gw50C_SHEukO"
      ],
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}