Molecular evolution and depth‐related adaptations of rhodopsin in the adaptive radiation of cichlid fishes in Lake Tanganyika

Abstract The visual sensory system is essential for animals to perceive their environment and is thus under strong selection. In aquatic environments, light intensity and spectrum differ primarily along a depth gradient. Rhodopsin (RH1) is the only opsin responsible for dim‐light vision in vertebrates and has been shown to evolve in response to the respective light conditions, including along a water depth gradient in fishes. In this study, we examined the diversity and sequence evolution of RH1 in virtually the entire adaptive radiation of cichlid fishes in Lake Tanganyika, focusing on adaptations to the environmental light with respect to depth. We show that Tanganyikan cichlid genomes contain a single copy of RH1. The 76 variable amino acid sites detected in RH1 across the radiation were not uniformly distributed along the protein sequence, and 31 of these variable sites show signals of positive selection. Moreover, the amino acid substitutions at 15 positively selected sites appeared to be depth‐related, including three key tuning sites that directly mediate shifts in the peak spectral sensitivity, one site involved in protein stability and 11 sites that may be functionally important on the basis of their physicochemical properties. Among the strongest candidate sites for deep‐water adaptations are two known key tuning sites (positions 292 and 299) and three newly identified variable sites (37, 104 and 290). Our study, which is the first comprehensive analysis of RH1 evolution in a massive adaptive radiation of cichlid fishes, provides novel insights into the evolution of RH1 in a freshwater environment.

In vertebrates, two types of photoreceptor cells located in the retina-cones and rods-are typically responsible for bright-light (photopic) and dim-light (scotopic) vision, respectively (Bowmaker, 2008; but see Musilova et al., 2021). Cones and rods harbour visual pigments, which are composed of an opsin protein bound to a lightsensitive vitamin A-derived, nonprotein chromophore (Bowmaker, 2008). Each visual pigment absorbs a specific waveband in the range of the ultraviolet (UV) to the red light spectrum, and can be characterized by its maximal spectral sensitivity ( ) (Carleton et al., 2020). Vertebrates possess up to four main types of cone opsins, the short-wavelength sensitive SWS1 and SWS2, the middlewavelength sensitive RH2, and the long-wavelength sensitive LWS, plus one rod opsin (rhodopsin, RH1; Bowmaker, 2008;Musilova et al., 2021;Musilova, Cortesi, et al., 2019). These opsins belong to a larger gene family, the transmembrane G protein-coupled receptors (GPCRs), which initiate signalling pathways through the activation of G proteins. GPCRs feature largely conserved structural motifs including seven transmembrane alpha-helices (TM) that form a ligand-binding pocket (Bowmaker, 2008;Palczewski et al., 2000;Terakita, 2005). In visual opsin proteins, the ligand-binding pocket is called retinal binding pocket and binds to the chromophore via a retinal-binding site (located in TM VII) and a Schiff base counterion (in TM III; Bowmaker, 2008;Palczewski et al., 2000;Terakita, 2005).
In addition, a disulphide bridge between two amino acid sites maintains the organization of the seven-helix transmembrane motif by bringing together TM III and the extracellular loop between TM IV and V. Visual opsin proteins are hence characterized by conserved amino acid sites (including those responsible for structural stability and chromophore binding) and highly variable sites, whereby a number of amino acid changes at these latter sites have been suggested to be adaptive by fine-tuning the visual sensitivity to match the local light environment (Bowmaker, 2008;Carleton et al., 2020;Hauser & Chang, 2017;Musilova et al., 2021). Next to amino acid substitutions, adaptive changes in opsins may also occur through gene duplication or loss, or the expression of different opsin palettes (e.g., Carleton et al., 2020;Hauser & Chang, 2017;Lin et al., 2017;Musilova, Indermaur, et al., 2019). Thus, different strategies can be applied to assess whether changes in visual opsin proteins are adaptive and correlate with ecology, including positive selection analyses and photoreceptor sensitivity ( ) estimations (reviewed by Carleton et al., 2020).
In aquatic environments, the visual scenes vary dramatically depending on several different biotic and abiotic factors (Munz & McFarland, 1977;Warrant & Locket, 2004;Yokoyama & Yokoyama, 1996). As light (mostly provided by the sun) passes through the water column, it is continuously absorbed and scattered, resulting in an increasingly narrowed light spectrum with depth. As a consequence, shallow-water species are exposed to UV and a broader range of the visible light spectrum compared to those living in deep waters where blue and blue-green light dominates (Yokoyama & Yokoyama, 1996).
In addition, rivers, lakes and oceans have different optical properties that affect light transmission and absorption in the water column (Yokoyama & Yokoyama, 1996). For example, the presence of phytoplankton and dissolved organic material as well as turbidity reduce light penetration, most notably in inshore and inland water bodies.
Accordingly, the visual system of fish has been shown to evolve and adapt both anatomically and physiologically to divergent light environments (Carleton et al., 2020;Davies et al., 2012;Musilova, Cortesi, et al., 2019;Seehausen et al., 2008). For instance, deepsea fishes have evolved different phenotypic adaptations to catch more photons, such as big eyes, tubular eye structures, reflective tapeta that are wavelength-selective, multilayer retinae and lenses containing yellow pigment (Musilova, Cortesi, et al., 2019;Warrant & Locket, 2004).
RH1 is the single opsin in rods and is responsible for perceiving the achromatic visual information from the environment, typically featuring a -value around 500 nm (i.e., in the blue-green part of the light spectrum; Bowmaker, 2008;Hauser & Chang, 2017;Munz & McFarland, 1977). RH1 is the best studied opsin gene and the first GPCR that had its crystal structure determined (Palczewski et al., 2000). Structural conformation studies found that RH1 exists as dimers and that the oligomeric state of RH1 is important for the activation of G proteins (Fotiadis et al., 2003;Guo et al., 2005). Further, through phylogenetic comparative analyses and in vitro protein reconstructions, specific amino acid substitutions at so-called "key tuning" sites have been identified that change and, hence, the function of RH1 (Hunt et al., 2014;Sugawara et al., 2005;Yokoyama et al., 2008). Previous studies also found that deep-water living fishes show specific molecular adaptations and/or have expanded their RH1 repertoire to efficiently adjust their visual system to the light environment at depth (Hunt et al., 1996;Musilova, Cortesi, et al., 2019;Nagai et al., 2011;Sugawara et al., 2005). For example, more than 20 amino acid substitutions at key tuning sites have been identified that mediate a shift in towards shorter wavelengths (blue-shift), and the emergence of differently tuned RH1 copies has been suggested to increase visual sensitivity in deep waters (Hofmann et al., 2009;Musilova, Cortesi, et al., 2019;Sugawara et al., 2005). Such adaptive changes generally optimize visual sensitivity by matching the peak absorbance of RH1 with the peak wavelength of the environmental light (blue and blue-green light in deep waters, as compared to blue-to-red light in shallow waters ;Carleton et al., 2020;Yokoyama, 2008). Finally, some fish species living in deep waters, where hydrostatic pressure is elevated, were shown to feature particular amino acid substitutions in RH1 associated with protein stability (Porter et al., 2016). More precisely, the steadiness of the protein structural conformation (opsin dimers) and, hence, the function of RH1 is maintained in deep-water fishes through a reduction of the protein's adiabatic compressibility (Porter et al., 2016).
Cichlid fishes (Cichliformes, Cichlidae), which are currently undergoing dramatic adaptive radiations, most notably in the African Great Lakes, are an important model system in evolutionary biology (Kocher, 2004;McGee et al., 2020;Salzburger, 2018). These radiations have resulted in arrays of closely related species living in various habitats, including at different depths. Cichlids thus also emerge as a great model system to investigate the visual sensory system in general and visual opsin genes in particular (e.g., Escobar-Camacho & Carleton, 2015;Musilova, Indermaur, et al., 2019;Musilova et al., 2021;Schneider et al., 2020;Torres-Dowdall et al., 2015;Wright et al., 2020). Previous studies investigated the achromatic visual system of a handful of East African cichlid species (Malinsky et al., 2015;Miyagi et al., 2012;Nagai et al., 2011;Schott et al., 2014;Sugawara et al., 2005;Terai et al., 2017;Torres-Dowdall et al., 2015). A common finding so far is that cichlid species living at depth have evolved similar molecular changes at key tuning sites that may shift towards shorter wavelengths (blue-shift; Sugawara et al., 2005) to better match the peak wavelength of the environmental light. On a small subset of Lake Tanganyika (LT) cichlid species, Nagai et al. (2011) found that most deep-water species have a serine (S) at amino acid position 292, while the majority of shallow-water species feature an alanine (A). Note that the substitution A292S is predicted to shift by about −9 nm and has therefore been suggested to mediate a depth-related adaptation (Musilova, Cortesi, et al., 2019;Yokoyama et al., 2008).
In this study, we investigated the molecular evolution of RH1 in virtually the entire adaptive radiation of cichlid fishes from LT, which represents the ecologically, morphologically and behaviourally most diverse cichlid radiation (Ronco et al., 2021;Salzburger, 2018;Salzburger et al., 2014).

| Identification and assembly of rhodopsin
In a first step, we used bwa-mem (version 0.7.17,  to map all raw reads of each individual to the phylogenetically equi-

| Rhodopsin copy number
To screen for potential duplications of RH1 (see Musilova, Cortesi, et al.,

| Rhodopsin gene trees
To decrease the computational complexity of the phylogenetic analyses, we first reduced the multiple alignments for both CDS and AA to unique sequences. This resulted in 238 unique CDS and 158 unique AA sequences (including the reference sequence).
Ambiguous characters in the AA alignment (due to heterozygous nucleotide sites) were masked by "X." We then used jmodeltest (version 2.1.10, Darriba et al., 2012)

| Rhodopsin amino acid substitutions
To visualize the AA changes in RH1 on the gene trees, we modified the Bayesian and ML trees built with unique CDS sequences (see above), to recover all initial CDS tip labels (reversing the merging of unique CDS sequences). A length of 10 −4 was assigned to each branch for visualization purposes. We then again reduced each

| Positive selection analysis
Positively selected sites were identified using CodeML from the paml software package (version 4.9; Yang, 2007). To assess the impact of different tree topologies, we tested random site models were performed to test for pervasive site-level selection, which is equivalent to random site models in CodeML. We also tested for lineage-specific evolution using the branch-site method aBS-REL (Adaptive Branch-Site Random Effects Likelihood), which is equivalent to branch-site models in CodeML, with shallow-and deepwater living species as categories.

| Depth-related substitution analysis
To investigate putative depth-related substitutions, we retrieved AAs at variable sites for each species living in the shallow and deep waters included in the species tree from Ronco et al. (2021) and recorded each species as a binary AA score (absence or presence of a particular AA) at each position and the binary depth score

| Rhodopsin diversity in Tanganyikan cichlids
We identified and newly assembled the intron-less RH1 gene in 271 cichlid species covering the entire cichlid species flock in LT plus some outgroup taxa and species nested within the radiation. In total, 32 species were represented by one individual, 164 species by two or more individuals that had identical CDS, and 75 species were represented by two or more individuals with different CDS due to incompleteness, homozygous single nucleotide polymorphisms (SNPs) and/or individuals with heterozygous sites (Table   S1). Five incomplete CDS with early stop codons were removed from the analysis: JWA8 (Lamprologus meleagris, Lamprologini), IZI8 (Neolamprologus christyi, Lamprologini), ISA1 (Trematocara marginatum, Trematocarini), JXH4 (Petrochromis orthognathus, Tropheini) and LCF3 (Tropheus sp. "kirschfleck," Tropheini). A visual inspection revealed that all these incomplete CDS were due to regions with no coverage (i.e., lack of raw read data) leading to apparent frame shift mutations (i.e., the lack of nucleotides shifts the reading frame and thereby creates early stop codons). However, because the respective other individual of these five species had a complete CDS and could hence be used for downstream analyses, all species present in the original data set were also included in our analyses. Individuals of 18 species featured differences in the CDS between individuals due to homozygous SNPs, 44 species showed intraspecific differences due to heterozygous sites, and eight species showed differences between individuals due to both homozygous SNPs and heterozygous sites ( Figure S1). The minimum CDS and AA sequence identity was 95.3% and 92.5%, respectively, among all pairs of individuals (AA heterozygous sites masked; the CDS multiple alignment is available on Dryad: https:// doi.org/10.5061/dryad.4mw6m 90c7). No evidence of sex-specific differences was found across the data set. Intraspecific sequence variation was generally low or absent, except for three species that had more than one nucleotide/AA difference due to ho-  Figure S2A). More than 25% of the AA sites in each of these six regions were variable (with and without positions variable due to heterozygosity; Figure S2B). For all individuals including the reference, the retinal-binding site was fixed at K296 (TM VII), the Schiff base counterion at E113 (TM III), and the disulphide bond sites at C110 (TM III) and C187 (extracellular loop) (Bowmaker, 2008;Terakita, 2005). The conserved tripeptide sequence in TM III involved in G protein interactions was fixed at E134/R135/W136 (Menon et al., 2001). Moreover, of the 27 known key tuning sites in RH1 (Musilova, Cortesi, et al., 2019;Yokoyama et al., 2008)

| Rhodopsin copy number
We did not find evidence for multiple RH1 copies in the Tanganyikan cichlids' draft genome assemblies (the full blastn report is available on Dryad: https://doi.org/10.5061/dryad.4mw6m 90c7). The blastn analysis with the respective RH1 CDS recovered from the raw reads as query resulted in a highest-scoring hit to RH1 and a second highest-scoring hit to exoRH1 for all cichlid genomes. The ratio of the median coverage on the reference's RH1 CDS vs. the median coverage on the reference's overall genome did not reveal any evidence for a duplication of RH1, except for the two Neolamprologus splendens individuals that showed a coverage in RH1 of more than twice the genome-wide median coverage ( Figure 1). However, a close inspection of the read data of the two N. splendens genomes revealed a coverage distribution with high variance when mapped to the Nile tilapia reference genome. Although many sites showed comparatively high coverage, the highest proportion of positions was covered by only four or fewer reads, pushing down the genome-wide median coverage in these two specimens compared to the other genomes where the coverage was normally distributed with much less variance. We can attribute the much greater variation in coverage in the two N. splendens individuals to the comparatively high level of fragmentation of the extracted genomic DNA that was initially recovered from them (the original electropherograms indicated a fragment size distributed around 500-700 bp for the two N. splendens individuals, while the other samples' fragments were typically centred around >10 kb). This, in turn, is probably a consequence of these two specimens having been collected more than 30 years ago, whereas the vast majority of the remaining Tanganyikan cichlid genomes were sequenced from recent material producing high-quality DNA extracts (for details see Ronco et al., 2021). Furthermore, since the RH1 CDS of the two N. splendens individuals did not show any excess of heterozygous sites, as would be expected in the presence of a second, functionally different RH1 copy in a genome, we tentatively assume that the observed signals of an elevated median coverage in RH1 in N. splendens are artefacts.

| Rhodopsin gene trees
The   Ronco et al., 2021), whereas species-level relationships often remained unresolved. This is largely due to the overall short length of the RH1 CDS and AA sequences, and the relatively small number of variable positions (1/5 of the sequences) in proportion to the number of taxa analysed. For the same reasons, the Bayesian posterior probability and ML bootstrap values were low in parts. The topology tests revealed that the ML CDS and AA trees were more likely than the Bayesian trees. Therefore, we mainly present results using the ML trees in the following. The haplotype genealogy based on the ML gene tree (Figure 2) showed that, if species shared the same RH1 gene sequence (i.e., haplotype), this only occurred between species within but not between tribes (except for Tropheini and Haplochromini, but which should be considered as one clade because Tropheini is phylogenetically nested with Haplochromini; Salzburger et al., 2005). It further appears that within tribes-specifically in the Ectodini, Lamprologini and Tropheini/Haplochromini-shallow-, intermediate-, or deep-water living species can share the same RH1 haplotype (Figure 2).    and HA (with both the Bayesian and ML gene trees) and the category "shallow-water living species" as background branches and the category "deep-water living species" as foreground branches did not reveal a difference in RH1 sequence evolution between these groups (Table S6; aBS-REL results using hyphy were congruent with CodeML results).

| Depth-related substitution analysis
We found evidence for 23 depth-related substitutions in our data set; We found that, among the AA variants at sites probably evolving in dependence on the depth, the commonly observed pattern is that certain AA variants (F37, I104, I290, S292, A297 and A299, and with weaker effect also N33, S42, T95 and A213, Figure 5b; Figures

| DISCUSS ION
Rhodopsin (RH1) is the only visual opsin type present in the rod cells of the retina, and is responsible for dim-light vision in vertebrates (Bowmaker, 2008; but see Musilova et al., 2021). While in terrestrial vertebrates adaptive changes in their single copy of RH1 are commonly found in association with a nocturnal lifestyle or crepuscular activity patterns (e.g., Castiglione & Chang, 2018;Hauzman et al., 2017), molecular adaptations in RH1 of fishes-as the main group of vertebrates living in the aquatic realm-are most often associated with the water depth at which a species occurs, and in some cases also with water turbidity (Musilova et al., 2021). In addition to acquiring specific changes in the protein-coding sequence of RH1 in response to the light environment in deeper waters (Hofmann et al., 2009;Hope et al., 1997;Hunt et al., 1996;Malinsky et al., 2015;Musilova, Cortesi, et al., 2019;Nagai et al., 2011;Sugawara et al., 2005), some fishes were shown to have expanded their RH1 repertoire via gene duplication and subsequent functional diversification, resulting in intraspecific arrays of differently tuned copies of RH1

F I G U R E 2
Rhodopsin nucleotide haplotype genealogy based on the maximum-likelihood gene tree. Each pie chart represents a unique CDS (coding sequence) of the rhodopsin gene, and its size corresponds to the number of individuals that share this haplotype (whenever two or more individuals share a haplotype, the number of individuals is reported inside the pie charts). Pie charts are colour-coded according to depth category (see box), and cichlid tribes are indicated with coloured background shadings (according to Ronco et al., 2021) (Table S1), starting from a common ancestor that colonized the emerging lake about 10 million years ago (Ronco et al., 2021).
Our in-depth analyses of RH1 in the endemic cichlid fauna of the LT basin based on whole-genome raw sequence data of 517 specimens from 271 species revealed the presence of a single copy of RH1 per genome across the radiation (Figure 1). We note that for two genomes-namely those of the two representatives of Neolamprologus splendens-a coverage pattern in the CDS of RH1 was retrieved that, when compared to the genome-wide median, would be compatible with a gene duplication scenario for RH1 ( Figure 1). However, upon closer inspection, we tentatively argue that the seemingly higher coverage in the CDS of RH1 in N. splendens is an artefact. Thus, unlike what has been found in deep-sea fishes and in some species of freshwater fishes living in murky waters (Musilova, Cortesi, et al., 2019;Musilova et al., 2021), the adaptation to scotopic light conditions does not seem to have involved lineageor species-specific duplications of RH1 in LT cichlids.
When focusing on RH1 sequence evolution in the adaptive radiation of cichlid fishes in LT, we identified a total of 237 unique CDS

F I G U R E 3
Amino acid substitutions in the rhodopsin protein mapped on the maximum-likelihood (ML) gene tree. The coloured arches around the ML gene trees indicate the tribe to which a species belongs (see Figure S3 and Table S1 for full species names). The number and the letter(s) next to each abbreviated species name represent the number of individuals and their IDs (see Table S1 [CDS tip label]). Each coloured circle corresponds to a change occurring at a specific amino acid position along the rhodopsin protein sequence. TM: transmembrane alpha-helix. (a) Amino acid substitutions in known key tuning sites (circles) and in sites involved in protein stability along the depth gradient (in bold, star-shaped symbol). (b) Amino acid substitutions in positively selected sites (CodeML random-site models M2a and M8). Positively selected sites present in M8 but not in M2a are marked with an asterisk (*)  The pairs of models were tested using a Likelihood Ratio Test (LRT) following a χ 2 distribution. Values of each site class ω 0 , ω 1 and ω 2 are specified for models M1a and M2a. The shape parameters p and q are specified for models M7 and M8. The value ω p corresponds to the positively selected site class for M8. The proportion of each site class is given in parentheses (haplotypes) and 157 unique AA sequences in the data set (excluding the reference sequence in both cases). In the Bayesian and ML gene trees reconstructed from these data, haplotypes and AA sequences clustered according to tribes and genera-thus reflecting phylogenetic relationships as established in a species-tree analysis from genome-wide SNPs (Ronco et al., 2021)-but not according to depth at which a species occurs, nor to feeding ecology (Figure 2; Figures   S3 and S4). This is best illustrated by the haplotype genealogy based on unique CDS of the RH1 gene in LT cichlids (Figure 2) again highlighting that convergent evolution is common in cichlid adaptive radiations (Muschick et al., 2012), in this case on the molecular level. Some of these convergent cases, especially among more closely related species, could also be the result of introgression and/or incomplete lineage sorting (Salzburger, 2018).
Intraspecific sequence variation in RH1 was very low or absent for the vast majority of species, except for three species that featured noticeable degrees of intraspecific sequence variation between the two specimens examined ( Figure S1). There is no obvious reason-for example, with respect to phylogeny, ecology, demography, behaviour or morphology-that could explain why these three species are more diverse in RH1 than others. To some extent, however, intraspecific variation should be expected in instances of adaptive radiation, which are characterized by incomplete lineage sorting and occasional hybridization (see Salzburger, 2018). Lastly, there was no indication for sex-specific nucleotide differences across the data set.
The distribution of the variable (and also of the positively selected) AA sites along the protein was not random in our data set, with the majority of variable sites (60/76) being located in the Nterminal (extracellular side) and the transmembrane alpha-helices I, IV, V, VI and VII (Figure 4; Figure S2A and B), and most (23/31) of the positively selected sites in the N-terminal and TM IV, V and VII (Figure 4). On the other hand, TM II and III, and the intracellular side of RH1 were found to be rather conserved and to have primarily evolved under purifying selection in LT cichlids. Also, the AAs at the retinal-binding site 296, the Schiff base counterion at position 113, and at the disulphide bond sites 110 and 187 were conserved in all cichlid species included in this study, as well as in Nile tilapia, and congruent with what has been reported previously (Bowmaker, 2008;Menon et al., 2001;Terakita, 2005). This suggests that-at least over short evolutionary timescales-some AA sites of RH1 are, more than others, involved in rapid adaptive evolution, which may in part be explained by functional constraints on the respective other regions. For example, the conserved TM III not only contains the disulphide bond site 110 and a fixed tripeptide motif (sites 134, 135 and 136; Menon et al., 2001), but also seven AA sites involved in the formation of the retinal binding pocket (positions 114, 117, 120 and 121, and key tuning sites 113, 118 and 122; Musilova, Cortesi, et al., 2019;Ou et al., 2011;Yokoyama, 2008). By contrast, the regions of RH1 around the dimerization interface (TM IV and V; Fotiadis et al., 2003;Guo et al., 2005) and adjacent to the retinal binding site (position in TM VII) as well as the N-terminal region emerge as mutational hotspots and main targets of positive selection in LT cichlids ( Figure 4).
Because, in fishes, adaptive evolution in RH1 is known to be impacted by the ambient light environment along the depth gradient (Hunt et al., 1996;Musilova, Cortesi, et al., 2019;Nagai et al., 2011;Sugawara et al., 2005), and because the roughly 250 cichlid species in LT cover a vast range of niches including from shallow to deep waters, we were particularly interested in depth-related adaptations in RH1 in this spectacular example of adaptive radiation. We thus applied phylogenetic comparative methods to test for potential associations between the variable AAs and depth, and, in doing so, identified 23 substitutions at 15 AA sites that are putatively involved in deep-water visual adaptations in LT cichlids (Figure 5a; Figure S8).
This list contains AA sites previously suggested to be involved in deep-water adaptations, including three previously known key tuning sites (214, 292 and 299) and one site (213) likely to be involved in the maintenance of protein stability (Hope et al., 1997;Hunt et al., 1996Hunt et al., , 2001Malinsky et al., 2015;Musilova, Cortesi, et al., 2019;Nagai et al., 2011;Porter et al., 2016;Sugawara et al., 2005;Varela & Ritchie, 2015), as well as 11 novel candidate sites that have not yet been implicated with living at depth and for which the exact functions are currently unknown (sites 17, 33, 37, 42, 95, 104, 162, 165, 290, 297 and 298) (Figure 5a; Figures S8 and S9). Our results may thus serve as a starting point for future functional tests to determine the effect of these particular AA substitutions on of RH1.
Importantly, all AA substitutions in RH1 that we identified as candidates for deep-water adaptations in the cichlid adaptive radiation of LT also show signals of positive selection (Figure 3; based on what is already known about their function (or their F I G U R E 5 Depth-related substitution analysis of rhodopsin (RH1) amino acid sites. (a) Dotplots of bayestraits results. Each dot corresponds to the difference in AICs of the independent model and the dependent model (absence/presence of an amino acid at a specific site in RH1 and depth at which species occur) and is colour-coded according to the RH1 regions (TM: transmembrane alpha-helix; see Figure  4). The x-axis shows the positions along the RH1 protein sequence, and the y-axis shows the difference in AICs between the two models. The horizontal dashed line is fixed at zero, meaning that dots above this threshold indicate those amino acids that are associated with the water depth at which a species occurs (shallow-vs. deep-water living species). (b) Dotplots of sites with exactly two amino acid variants associated with water depth (colour-coded as in (a)). Each individual is represented by a single dot, colour-coded according to tribe. The x-axis represents the amino acid variants, and the y-axis represents the shallow-and deep-water living species (plotted with jitter points and without ambiguous amino acid sites for better visualization). Note that sites 292 and 299 are among the known key tuning sites in RH1, with substitutions predicted to shift the peak spectral sensitivity (the blue "-" symbol indicates a predicted shift towards shorter wavelengths, and the red "+" indicates a predicted shift towards longer wavelengths). An extended version of this figure showing all identified sites is provided as Figure S8 occurrence) in other deep-living species of fish. For example, previous work in both deep-sea and deep-living freshwater fishes revealed specific depth-related molecular adaptations at key tuning sites in RH1 (Hope et al., 1997;Hunt et al., 1996Hunt et al., , 2001Malinsky et al., 2015;Nagai et al., 2011;Varela & Ritchie, 2015), including N83, S292 and A299, which all mediate a blue-shift in that is considered adaptive at depth. In our data set, all species featured D83, except for the deep-water living Baileychromis centropomoides (N83) and two out of three Bathybates vittatus individuals Massoko. We found that three of them (V162, S166, A298) are also present in the majority of deep-water living cichlid species in LT, while there is a difference in the fourth one (G297 in the benthic ectomorph in Lake Massoko vs. predominantly A297 in deep-water living species in LT; Figure 5B; Figures S8 and S9). It is of note that V162, which was also among the three sites with the largest number of different AAs across the data set (the others being 213 and 217; Figures S2 and S9), was also retrieved with the depth-related substitution analysis (just as A297 and S298 were).
Without doubt, however, it would be necessary to determine the spectral sensitivity properties of the newly identified variants of RH1 by measuring the absorption spectrum of the RH1 pigments (as, e.g., reported in Sugawara et al., 2005), to confirm whether these indeed mediate the hypothesized blue-shift of in deepwater living Tanganyikan cichlid species or contribute to protein stability at depth.

| CON CLUS ION
Cichlid fishes are an important model group to investigate the visual sensory system in general and visual opsin genes in particular in the aquatic environment. In this study, we present the first all-inclusive analysis of RH1 molecular evolution in an entire massive adaptive radiation, that of cichlid fishes in LT. Our in-depth genomic investi- depth. These include three known key-tuning sites, one site with a putative function in protein stability with respect to water depth, as well as 11 novel candidate sites for deep-water adaptations in (cichlid) fishes. Importantly, all the candidates we identified based on the depth-related substitution analyses also emerged as potentially important sites based on our molecular evolutionary inferences, phylogenetic comparisons and positive selection tests. Together, our integrative study on the molecular evolution of the visual system of cichlid fishes in LT provides a comprehensive view on the patterns of RH1 evolution in a freshwater environment and more generally on the evolutionary dynamics of environmentally driven adaptations.

CO N FLI C T O F I NTE R E S T
The authors declare that they have no competing interests.

AUTH O R CO NTR I B UTI O N S
V.R., W.S. and Z.M. designed the project. V.R. performed the rhodopsin analyses under the supervision of F.R., Z.M. and W.S. V.R. and F.R.
performed the coverage calculations and the depth-related substitution analysis. V.R. and W.S. wrote the initial manuscript draft. All authors commented on the manuscript and approved the final version.