A high-quality genetic reference database for European commercial fishes reveals substitution fraud of processed Atlantic cod (Gadus morhua) and common sole (Solea solea) at different steps in the Belgian supply chain

Seafood is an important component of the human diet. With depleting fish stocks and increasing prices, seafood is prone to fraudulent substitution. DNA barcoding has revealed fraudulent substitution of fishes in retail and restaurants. Whether substitution also occurs at other steps of the supply chain remains largely unknown. DNA barcoding relies on public reference databases for species identification, but these can contain incorrect identifications. A high-quality genetic reference database for 42 commercially important European fishes was therefore initiated, containing 145 Cytochrome c oxidase subunit I (COI) and 152 Cytochrome b (cytB) sequences. This database was used to identify substitution rates of Atlantic cod (Gadus morhua) and common sole (Solea solea) along the fish supply chain in Belgium using DNA barcoding. Three out of 132 cod samples were substituted, in catering (6 %), import (5 %) and fishmongers (3 %). Seven out of the 41 processed sole samples were substituted, in wholesale (100 %), food services (50 %), retailers (20 %) and catering (8 %). The results show that substitution of G. morhua and S. solea is not restricted to restaurants but also occurs in other parts of the supply chain, which warrants more stringent controls along the supply chain to increase transparency and consumer trust.
Running title: Substitution fraud of Atlantic cod and common sole along the Belgian food supply chain


Introduction
Seafood plays a key role in the human diet worldwide as a popular and healthy food source. After a steady increase since the 1950s, global fish production peaked at 171 million tonnes in 2016, of which 151 million tonnes were used for human consumption (FAO, 2018). As a consequence, fish stocks are under pressure globally (Bryndum-Buchholz et al., 2019; Worm et al., 2006), a situation aggravated by climate change (Cheung et al., 2013; Smalås et al., 2019). Pressure on supplies has led to fraudulent practices, such as illegal, unreported and unregulated (IUU) fisheries (Schmidt, 2005) and substitution of higher value species with cheaper alternatives (Fox et al., 2018; "Young's Seafood," n.d.). Apart from economic gain, substitution of seafood can also be driven by (1) economically valuable species becoming a limited resource with increased demand (FAO, 2018; Rehbein, 2008), (2) a wide range of visually similar fish being traded (Piñeiro et al., 2001), (3) consumers seeking easy access to low-cost foods (Fox et al., 2018), and (4) attempts to conceal IUU fishing practices (Fox et al., 2018; Vandamme et al., 2016).
Furthermore, substitution of fish seems to be present mainly at the end point of the supply chain, more specifically in restaurants, canteens and food services, as a consequence of less stringent controls and of processing, which makes the fish less recognisable (Christiansen et al., 2018; Kappel, 2016; Shehata et al., 2017). Little is known about the prevalence of substitution in other parts of the supply chain (Manning and Soon, 2014). A knock-on effect can be expected, as every subsequent part of the supply chain adds to the original deception (Gordoa et al., 2017). Correct information is critical to help consumers make informed choices (Bénard-Capelle et al., 2015; Meikle and McDonalds, 2013; Mottola et al., 2013) and to increase transparency and safety in the seafood industry.
https://doi.org/10.1016/j.fct.2020.111417 CC-BY-NC-ND license
Substitution might be more prevalent in highly processed seafood products, because morphological characteristics are no longer visible. The rise of molecular techniques, which allow species to be distinguished without diagnostic morphological traits, now enables substitution to be investigated in highly processed products (Bénard-Capelle et al., 2015; Devloo-Delva et al., 2016). DNA barcoding is a popular technique to quickly discover mislabelling (Bénard-Capelle et al., 2015; Christiansen et al., 2018; Hebert et al., 2003; Rees et al., 2014), relying on the amplification of marker genes such as Cytochrome c oxidase subunit I (COI) and Cytochrome b (cytB). Both regions in the mitochondrial genome allow for the identification of species upon comparing sequences to a reference database (Ward et al., 2005). The COI gene is an excellent barcode marker as the universal primers are robust, enabling recovery of its 5' end from representatives of most animal phyla (Folmer et al., 1994; Hebert et al., 2003). In addition, the cytB gene is used frequently for fish identifications, as its phylogenetic performance is comparable to that of COI, but its greater length increases species level identifications for some groups (Kochzius et al., 2010; Sevilla et al., 2007; Zardoya and Meyer, 1996).
Unfortunately, public databases still contain erroneous sequences that are linked to incorrect species names (Li et al., 2018; Mioduchowska et al., 2018), which complicates the correct assignment of species names to DNA sequences when used as a control or comparison tool. To cope with this, high quality databases have been created, such as the Barcode of Life Data Systems (BoLD) ("Bold Systems v4," n.d.) and Fishtrace ("FishTrace," n.d.). BoLD serves as a reference database for COI sequences, containing species name, voucher data, collection record, identifier, used primers and trace files. Fishtrace is a genetic catalogue for cytB and rhodopsin sequences of commercially important fish species (Ratnasingham and Hebert, 2007; Sevilla et al., 2007). To further increase the accuracy of DNA barcoding of fishes in Europe, a high quality reference database is needed that contains DNA sequences for both of the most commonly used barcode markers, obtained from morphologically identified and digitally vouchered specimens of the most important fish species traded on the European market.
As there are differences in substitution rates between species and countries (Bénard-Capelle et al., 2015), this study focused on substitution fraud for two fish species in the Belgian food supply chain.
Atlantic cod (Gadus morhua) was selected due to its popularity ("Vis -VLAM," n.d.) and its high selling price in Belgium. It is heavily imported into Belgium, in spite of declining stocks in the southern part of the Atlantic (Cook, 2018). Common sole (Solea solea) is one of the fish species most commonly landed by Belgian fishermen. Sole is an expensive product and considered a delicacy in Belgium ("Vis -VLAM," n.d.).
In this study, three specific objectives were addressed: (1) to initiate the creation of a reliable reference database for both COI and cytB sequences of commercially important species in Europe, to ensure correct identification through DNA barcoding; (2) to obtain insight into the current situation of seafood and fish trade in the Belgian supply chain; and (3) to evaluate the occurrence of substitution fraud for Atlantic cod G. morhua and common sole S. solea along the different steps of the Belgian food supply chain.

Creating a digital platform
Existing databases either contain the COI gene (BoLD) or the cytB gene (Fishtrace), or do not allow images of specimens to be uploaded (GenBank). Therefore, an online platform was developed using an SQL database (Microsoft, Redmond, USA) to store all sequences, images and metadata of the collected specimens of different commercially important European fish species. The database features a hierarchical storage system with digital images of morphologically identified seafood specimens and information on sampling protocols, lab protocols and storage location, alongside DNA sequence data in fasta format for each vouchered specimen. The SEAFOOD TOMORROW database is hosted by smarterasp.net (BusinessICS Intl Limited, Monterey Park, CA, US) and is freely accessible after registration (http://seafoodtomorrowdata.eu/).

Collection and sequencing of voucher specimens
Fish specimens from different geographic regions (Baltic Sea, inland waters of North East Europe, Northeast Pacific, North Sea, oceanic Northeast Atlantic waters) and from aquaculture were collected by researchers from different institutes (ILVO, ZUT and the University of Porto), vouchered and stored at -20°C. From every specimen, three fin clips and three muscle tissue samples were taken and stored in ethanol at -20°C. DNA was extracted from one of these tissues and also stored at -20°C. The COI gene, cytB gene and Rhodopsin gene were amplified using the different institutes' protocols. Rhodopsin was used only in cases where the two other markers did not suffice to discern species. Detailed information on the primer sets used by each institute is presented in supplementary Table S1. Details on the DNA extraction kit, Polymerase Chain Reaction (PCR) protocols and all available metadata for each specimen are listed in the SEAFOOD TOMORROW database.

Sequence data analysis
To ensure all sequences were assigned to the correct species name, phylogenetic trees for each marker gene were built in R (R Core Team, 2018) using the following packages: seqRFLP (Ding and Zhang, 2012), msa (Bodenhofer et al., 2015), seqinr (Charif and Lobry, 2007), ape (Paradis and Schliep, 2018), Biostrings (Pagès et al., 2018), ggplot2 (Wickham, 2016) and ggtree (Yu et al., 2018). Sequences from the SEAFOOD TOMORROW database were converted to a fasta file, after which they were aligned using ClustalW (Thompson et al., 1994). A matrix of pairwise distances from the aligned sequences was calculated based on sequence similarity. Inter- and intraspecific p-distances, calculated using the adhoc package (Sonet et al., 2013), were investigated to identify the presence of a barcoding gap.
Thresholds for both COI and cytB genes were identified, below which samples were considered to belong to the same species. P-distance frequency graphs and graphs of error rates at different thresholds were plotted using R (R Core Team, 2018) and ggplot2. Cumulative errors were calculated based on false positives and false negatives for a range of threshold values, to define the optimal threshold for discriminating between two species (Wiemers and Fiedler, 2007). A false negative occurs when two sequences of the same species are identified as different species (threshold too low). A false positive occurs when two sequences of different species are identified as the same species (threshold too high).
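The threshold search described above can be illustrated with a minimal sketch. The study itself used R and the adhoc package; the Python below is only an illustration under stated assumptions. The function names, the toy sequences and the rule that a distance strictly below the threshold means "same species" are assumptions made for this example and are not taken from the study's scripts.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch).
    """
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites between the two sequences")
    return sum(a != b for a, b in pairs) / len(pairs)


def pairwise_distances(records):
    """Split all pairwise p-distances into intra- and interspecific lists.

    records is a list of (species_name, aligned_sequence) tuples.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return intra, inter


def cumulative_error(intra, inter, threshold):
    """False negatives plus false positives at a given distance threshold."""
    false_negatives = sum(d >= threshold for d in intra)  # same species split
    false_positives = sum(d < threshold for d in inter)   # species lumped
    return false_negatives + false_positives


# Toy alignment with made-up sequences (hypothetical data, not from the study).
records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACTTGC"),
    ("species_B", "ACGAACTTGA"),
]
intra, inter = pairwise_distances(records)
for threshold in (0.05, 0.20, 0.50):
    print(threshold, cumulative_error(intra, inter, threshold))
```

Any threshold that falls inside the gap between the largest intraspecific and the smallest interspecific distance gives a cumulative error of zero, which is how a range of optimal thresholds arises.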

Scoping and sampling the Belgian fish supply chain
The Belgian supply chain for fish was assessed through interviews with local stakeholders and scientists, and additional information from reports and literature (Blondeel et al., 2016;Fox et al., 2018;Verlé et al., 2016).
As identifying intact fish usually does not require molecular techniques, this study investigated only 'processed' fish products, i.e. where the whole specimen was no longer present or recognisable. A power analysis indicated that at least 30 samples per trader and per fish species were needed to ensure enough power to detect statistical differences in substitution rates between the different steps in the supply chain. For sole, this was problematic because most sole were traded directly from the auction to fishmongers and restaurants, or were sold only as whole specimens along the food supply chain.
All samples were anonymously purchased in Belgium. Import samples were analysed after being purchased by a company, which allowed samples to be traced back to the international exporter. The following metadata were recorded: collection date, name of the purchaser, species name and commercial name of the food product as mentioned by the trader, scientific name of the fish (if applicable), a photograph of the product (when possible), name and location of the trader, type of food product (fillet, meal or heavily processed), and brand (if applicable). After collection, the fish products were given a unique code and stored at -20°C.

Extraction, amplification and Sanger sequencing
Samples from the substitution part of the study were not stored in ethanol. Total DNA was isolated from approximately 200 mg of food product using the NucleoSpin® Food kit (Macherey-Nagel GmbH & Co. KG, Düren, Germany) following the manufacturer's instructions. For amplification, three different primer sets were used (supplementary Table S1). The first genetic marker was COI, for which primer set 1 was used. Because the COI marker gene could not clearly distinguish Limanda aspera from Limanda limanda, the cytB marker gene was amplified with primer set 2. In case of degraded DNA, a shorter fragment (364 bp) of the cytB gene was amplified with primer set 3.
PCR reactions were performed in volumes of 40 µl containing 25 µl VWR Red Taq DNA Polymerase Master Mix (VWR International, Oud-Heverlee, Belgium). For each reaction 0.2 µM (4 µl) forward and reverse primers were added. The same volume of DNA was added to all reactions for the same genetic marker (2 µl for the COI gene and 2.5 µl for the cytB gene). PCR-grade water was added until a total reaction volume of 40 µl was obtained. PCR cycling conditions for the COI fragment (primer set 1) consisted of 2 minutes at 94°C, followed by 35 cycles of 30 seconds at 94°C, 40 seconds at 52°C and 60 seconds at 72°C. Reactions were finished after a final extension step of 10 minutes at 72°C. For the cytB fragments (primer sets 2 and 3), reactions were preheated for 5 minutes at 95°C, followed by 40 cycles of 30 seconds at 95°C, 30 seconds at 50°C and 60 seconds at 72°C. Reactions were finished after a final extension step of 10 minutes at 72°C. PCR products were loaded on a 1 % agarose gel, which was made of 100 ml 1× TAE buffer (a buffer mixture of Tris base, acetic acid and EDTA), 1 g agarose and 10 µl GelRed® Nucleic Acid Gel Stain (Biotium Inc, Fremont, US). After successful gel electrophoresis, the PCR product was cleaned using the Wizard® SV Gel and PCR Clean-Up System (Promega, Madison, USA), following the manufacturer's protocol. Samples were Sanger sequenced by Macrogen (Amsterdam, Netherlands) and GeneWiz (Bishop's Stortford, UK).

Sequence analysis
For visual inspection of the sequencing chromatograms and creation of consensus sequences, Bionumerics v7.6.3 (Applied Maths, Sint-Martens-Latem, Belgium) was used. DNA sequences were then compared against the SEAFOOD TOMORROW database (SeafoodTomorrow, n.d.) with BLAST+ (Basic Local Alignment Search Tool) (Altschul et al., 1990) (NCBI, Bethesda, USA). From the list of BLAST hits, the species with the highest query coverage, lowest E-value and highest identity was chosen. In case identity was below 99 %, samples were also BLASTed against the public library of GenBank (NCBI, Bethesda, USA) to ensure that this was not caused by absence of the species in the SEAFOOD TOMORROW database.
Samples were considered substituted when the detected species did not match the scientific name on the packaging or sales label. When the scientific name for a certain sample was not mentioned, the resulting scientific name was compared against the allowed commercial names for that species (https://ec.europa.eu/fisheries/cfp/market/consumer-information/names_en).
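The decision rules above can be sketched in a few lines of Python. This is an illustration only, not the workflow used in the study: the `Hit` record, the function names and the example values are hypothetical, and ranking the three criteria in the order coverage, E-value, identity is an assumption, since the text lists them without stating a priority. Only the 99 % identity cut-off and the label comparison come from the text.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST hit, reduced to the fields used in the text (hypothetical record)."""
    species: str
    query_coverage: float  # percent
    evalue: float
    identity: float        # percent


IDENTITY_CUTOFF = 99.0  # below this, the sample is also checked against GenBank


def best_hit(hits):
    """Pick the hit with highest coverage, then lowest E-value, then highest identity.

    The tie-breaking order is an assumption of this sketch.
    """
    return max(hits, key=lambda h: (h.query_coverage, -h.evalue, h.identity))


def needs_genbank_check(hit, cutoff=IDENTITY_CUTOFF):
    """True when the best local hit is too dissimilar to be trusted on its own."""
    return hit.identity < cutoff


def is_substituted(detected_species, labelled_species):
    """A sample counts as substituted when detected and labelled species differ."""
    return detected_species.strip().lower() != labelled_species.strip().lower()


# Made-up hit list for a product labelled as Solea solea (illustrative values).
hits = [
    Hit("Limanda limanda", 100.0, 0.0, 98.2),
    Hit("Pleuronectes platessa", 100.0, 1e-150, 93.0),
    Hit("Solea solea", 98.0, 1e-90, 85.0),
]
top = best_hit(hits)
print(top.species, needs_genbank_check(top), is_substituted(top.species, "Solea solea"))
```

In this made-up case the best local hit falls below 99 % identity, so the sequence would be queried against GenBank before a species name is assigned.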

Quantitative PCR (qPCR) identification of Atlantic cod
Alongside barcoding, a species-specific real-time PCR assay was used to identify Atlantic cod G. morhua. Briefly, the SureFood® Fish ID Gadus morhua IAAC (R&D version) kit (R-Biopharm, Darmstadt, Germany) was applied following the manufacturer's protocol. The kit distinguishes two signals at different wavelengths. A signal in the VIC channel indicates the presence of fish (internal control), while a signal in the FAM channel indicates the presence of G. morhua. As such, when a sample contains G. morhua both channels should detect a signal. If it is a different species, only the VIC channel will indicate the presence of fish. A LightCycler 480 I (Roche, Basel, Switzerland) was used to run the qPCR assay. When a product was suspected of substitution, the sample was sent for DNA sequencing, following the protocols as described in paragraph 2.2.2.
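The two-channel read-out can be summarised as a small decision function. The sketch below is a hedged illustration in Python, not part of the kit's software: the function name and return strings are hypothetical, and treating a run without a VIC (internal control) signal as inconclusive is an assumption, because the text does not describe that case.

```python
def interpret_qpcr(vic_signal: bool, fam_signal: bool) -> str:
    """Interpret the VIC (fish internal control) and FAM (G. morhua) channels."""
    if vic_signal and fam_signal:
        return "Gadus morhua"
    if vic_signal:
        # Fish DNA amplified, but the cod-specific target did not:
        # candidate for confirmation by DNA sequencing.
        return "fish, not Gadus morhua"
    # No internal control signal: assumed inconclusive in this sketch.
    return "inconclusive"


for vic, fam in [(True, True), (True, False), (False, False)]:
    print(vic, fam, interpret_qpcr(vic, fam))
```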

The SEAFOOD TOMORROW reference database
Currently, the SEAFOOD TOMORROW reference database holds information about 42 fish species of economic importance in Europe. In total 300 sequences were generated: 145 COI, 152 cytB and 3 Rhodopsin genes (Table 1). Phylogenetic trees of the cytB and COI genes (supplementary material Figure S1 and S2 respectively) showed that conspecifics cluster together and that congeneric species were more closely related than non-congeneric species, illustrating that all sequences were linked to the correct species names. The frequency plots (Figure 1A and 1C) represent p-distances created by comparing all sequences present in the SEAFOOD TOMORROW database for COI and cytB genes respectively, and illustrate no overlap between intra- and interspecific distances. The optimal threshold is defined as the threshold at which species can be identified without any false negative or false positive result. The optimal threshold for the COI gene ranges from 2 to 4 %, and that for the cytB gene ranges from 1.6 to 5.6 % (Figure 1B and 1D).
Eleven steps in the Belgian fish food supply chain were identified: (1) fishermen (the people who catch the fish); (2) the fish auction (where the catch is landed); (3) export (fish sold to countries outside of Belgium); (4) import (fish sold by countries outside of Belgium); (5) the fishermen's market (sells fish bought by fishermen directly to customers); (6) wholesale (bulk acquisition and selling of fish food products between traders); (7) processing, which includes both preparation (filleting, gutting and boning of a fish) and effective processing (the addition of ingredients, changes in temperature or breading); (8) fishmongers (specialised in trading fish); (9) retailers (stores where a large variety of food products are sold, including fish); (10) catering (services where food tailored to needs is ordered individually, e.g. restaurants); and (11) food services (where food is prepared in bulk, e.g. canteens).

The Belgian fish supply chain
After fish are landed, the catch is transported to three potential locations: the fishermen's market "De Vistrap" (located in Oostende and Nieuwpoort), the auction hall and exporters. The fish auction hall, in turn, sells fish to four potential customers: processors, wholesalers, fishmongers and companies outside Belgium (export). Wholesalers and processors also import fish from outside Belgium or export their products. Retailers and food services buy fish from wholesalers and processors. Fishmongers buy fish from the auction hall and from wholesalers. Caterers buy fish from wholesalers, fishmongers and, in certain cases, directly from the fish auction. Additionally, wholesalers and processors sell products back and forth between one another and, often, companies will engage in all of these activities (preparation, processing and wholesale). The network is presented in Figure 2.
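The trade relations described in this section can be written down as a directed graph, which makes it easy to list every step a product may have passed through before reaching a given trader. The Python sketch below is an illustrative reading of the text, not a reproduction of Figure 2: the dictionary name, the node labels and the helper function are hypothetical, and each edge simply means "sells to".

```python
from collections import deque

# Each key sells fish to the traders in its list (edges read from the text).
SUPPLY_CHAIN = {
    "fishermen": ["fishermen's market", "fish auction", "export"],
    "fish auction": ["processing", "wholesale", "fishmongers", "export", "catering"],
    "import": ["wholesale", "processing"],
    "wholesale": ["processing", "retailers", "food services",
                  "fishmongers", "catering", "export"],
    "processing": ["wholesale", "retailers", "food services", "export"],
    "fishmongers": ["catering"],
    "fishermen's market": [],
    "retailers": [],
    "catering": [],
    "food services": [],
    "export": [],
}


def upstream(step, graph=SUPPLY_CHAIN):
    """Return every step from which a product can reach the given step."""
    # Build the reversed graph: buyer -> list of sellers.
    sellers_of = {node: [] for node in graph}
    for seller, buyers in graph.items():
        for buyer in buyers:
            sellers_of[buyer].append(seller)
    # Breadth-first search over the reversed edges.
    seen, queue = set(), deque([step])
    while queue:
        node = queue.popleft()
        for seller in sellers_of[node]:
            if seller not in seen:
                seen.add(seller)
                queue.append(seller)
    seen.discard(step)  # a cycle (wholesale <-> processing) may lead back to the start
    return seen


print(sorted(upstream("retailers")))
print(sorted(upstream("catering")))
```

Listing the upstream steps shows where a substitution detected at, for example, a retailer could have been introduced.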

Substitution of cod and sole in the Belgian food supply chain
As this study focused on 'processed' fish products, we assessed neither fishermen nor the fish auction. Also, 'Export' was not assessed as this study focused on the 'real' Belgian food supply chain, leaving eight steps of the Belgian food supply chain that were sampled for 'processed' cod and sole (Figure 2).
A total of 139 'processed' cod products were sampled: 2 from the fishermen's market, 25 from import, 1 from wholesalers, 18 from processors, 33 from fishmongers, 16 from catering, 35 from retail, and 9 from food service. A total of 46 'processed' sole products could be sampled: 3 from wholesalers, 1 from processors, 26 from catering, 14 from retail and 2 from food service. Samples for both cod and sole were collected mainly in the western (Flemish) part of Belgium (Figure 3A). Imported cod samples originated mainly from the United Kingdom, Iceland and Denmark, next to different East European countries and Russia (Figure 3B).
Of the 185 samples collected, 182 were identified successfully. One cod sample from a fishmonger could not be amplified with either qPCR or conventional PCR using any of the primer sets, as the DNA was found to be of poor quality. Two cod samples that could not be identified by qPCR were properly identified by DNA barcoding. One of the two cod samples that could not be identified with barcoding was successfully identified using qPCR. One sole sample from a wholesaler could be amplified with PCR for both the COI and cytB gene, but the Sanger sequencing chromatograms were highly contaminated. A second sole sample was amplified for the cytB gene and Sanger sequenced but could not be identified reliably, as the best match on GenBank was only 91 % with the species Cynoglossus joyneri. This clearly indicated that this sample was not sole, and as such it was retained in the further analyses. As a result, a total of 138 cod samples and 45 sole samples were used for further analysis.
Table 2 shows the ratios of successful identifications for both DNA barcoding and qPCR for different food types. Some samples were barcoded with both COI and cytB, and when cod samples were suspected of being fraudulent, they were also barcoded to confirm the substitution. Additionally, qPCR was not used at the start of the study and not every cod sample was analysed with qPCR. No perceivable difference in the success rates of identification was found between the techniques (for cod samples only) or the grade of processing (both cod and sole samples, Table 2). In total, 138 cod and 45 sole food products were identified molecularly. Certain food products from the same retailer were collected multiple times, sometimes in a different city. The outcome was identical for the same products of the same retailer but from different stores. To avoid inflation of the substitution rate, only one sample of these food products was retained. As a result, substitution rates were based on 132 cod samples and 41 sole samples.
For cod, one out of 16 catering samples (6 %) was a meal substituted with Pollachius virens; one of 21 import samples (5 %) was a fillet substituted with Melanogrammus aeglefinus; one of 32 fishmonger samples (3 %) was a processed fish substituted with Gadus chalcogrammus. Substitution occurred in all three food types (Figure 4A and C, Figure 5A and C).
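The percentages reported for cod follow directly from the sample counts, as the short Python check below shows. The counts are those given in the text; the function and variable names are hypothetical and only serve this worked example.

```python
def substitution_rate(substituted, total):
    """Percentage of substituted samples among all samples of one step."""
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    return 100.0 * substituted / total


# (substituted, total) per supply chain step with at least one substitution.
cod_counts = {"catering": (1, 16), "import": (1, 21), "fishmongers": (1, 32)}

for step, (n_sub, n_total) in cod_counts.items():
    print(f"{step}: {n_sub}/{n_total} = {substitution_rate(n_sub, n_total):.0f} %")

# Overall rate across all 132 cod samples retained for the analysis.
print(f"overall: 3/132 = {substitution_rate(3, 132):.0f} %")
```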
Substitution rates for sole were as follows: 2 out of 2 for wholesale (100 %), 1 out of 2 for food service (50 %), 20 % for retail and 8 % for catering (Figure 4B). Substitution of sole was found in two 'processed' food products: 2 out of 10 fillets (20 %) and 5 out of 31 meals (16 %) (Figure 4D). For the meal sample from a food service, S. solea was replaced with L. aspera (Figure 5B and Figure 5D). Two substituted samples from wholesale (both fillets) were Cynoglossus sp. and S. senegalensis, the latter sold as 'farmed sole'. In five meal samples from retail, sole was substituted with L. aspera (4 samples) and Pangasianodon hypophthalmus (1 sample). Additionally, one of the retail samples sold as sole returned both L. aspera and Lepidopsetta polyxystra. Also, the substituted meal sample from a food service contained L. aspera. Two sole catering samples were substituted with L. aspera and Microstomus kitt. Two retail samples were sold as "Tongrolletjes", a Belgian dish normally prepared with S. solea, but the label stated it to be lemon sole (M. kitt). These samples were also classified as substituted, as misleading the consumer is equally considered as fraud.

The SEAFOOD TOMORROW reference database
Through DNA barcoding, the amplified marker gene sequence is compared with known sequences linked to species names registered in reference databases (Hebert et al., 2003). The importance of a reliable reference database for correct species identification cannot be overstated (Li et al., 2018; Mioduchowska et al., 2018). Therefore, we initiated the creation of a reference database for economically important European fish species and their prominent substitutes. All specimens were morphologically identified by experts and in-depth quality control of the obtained sequences was performed. For every collected specimen a picture is uploaded and the metadata can be traced back to the specimen itself, but also to the extracted tissues, the extracted DNA from the tissue, the laboratory protocols and the DNA sequences for both COI and cytB genes. The SEAFOOD TOMORROW database is different from the COI targeted BoLD database, as it focuses on commercially important European species and also includes cytB sequences, which allows better species resolution for a number of closely related fishes than the COI gene (Kochzius et al., 2010). Compared to Fishtrace, the current database contains more information, as both cytB and COI sequences for commercially important European fish species are registered, and photographic material is included, which allows morphological identifications to be verified at any point.
Frequency distribution plots of intra- and interspecific distances showed no overlap for either the COI or the cytB gene. Based on this barcoding gap, correct identification of new samples is expected when sequence similarity is above 96 % for COI and above 95 % for cytB. The barcoding gap is expected to narrow when more species are added (Hubert et al., 2008; Kochzius et al., 2010; Pereira et al., 2013; Ward, 2009). Moreover, pitfalls related to species identification using the barcoding gap are well known (Meier et al., 2008), including the incompleteness of the database (Virgilio et al., 2012). One example that showcases this is the occurrence of L. aspera as a substitute in the current study; this is not a European fish and as such was not included in the SEAFOOD TOMORROW database. However, this species is genetically highly similar to L. limanda (BLASTs reaching >98 % identity for COI and >96 % identity for cytB). Even though the identity appears to fall within the previously described thresholds, sequences below 99 % identity were additionally BLASTed against GenBank, which revealed that the samples were more likely to be L. aspera. Since similarity of the cytB sequences was very low when compared to the SEAFOOD TOMORROW reference database, the sequences of this species were compared against the public databases, and yielded similarities of 100 % with L. aspera sequences from GenBank. The effect of the incompleteness of the reference database for species level identification is further illustrated for the sole sample that appeared most similar to species from the Cynoglossus genus. Not all species of the Cynoglossus genus are currently present in public databases, so species level identification for this specimen was not possible regardless of the reference database used. This indicates that reliance solely on a threshold does not guarantee that the identification is correct and that multiple genes should be checked in order to increase the reliability of the identification.
[...] rarely processed and mostly sold as complete fish to wholesalers, fishmongers and catering. Also, retailers mostly sell complete fish, traded from wholesalers. The few companies that do process sole buy it from the fish auction and convert it to fillets or fish meals, which are sold to retailers and food services. The difference in supply chains for cod and sole created the ideal scenario to thoroughly evaluate substitution in the entire fish food supply chain in Belgium.

Substitution of cod and sole along the Belgian food supply chain
Atlantic cod G. morhua was substituted in only three out of 132 (2 %) samples along the Belgian supply chain. This is lower than the mislabelling rates found in the meta-analysis by Luque and Donlan (2019). No substitutions were found in retail, confirming a study across North Atlantic countries, which also did not find mislabelling of cod retailed in Belgium or other European countries (Bréchon et al., 2016). In contrast, a study in France found 11 % mislabelling of cod products collected from fishmongers, supermarkets and restaurants, while another study found 55 % mislabelling for retailers in Italy (Bénard-Capelle et al., 2015; Pinto et al., 2013). The 4 % substitution rate recorded for cod in food services and restaurants is lower than the 13 % described for food services and restaurants in Brussels, the capital of Belgium (Christiansen et al., 2018). In contrast, the current samples were gathered closer to the Belgian coast, where consumers may have more knowledge about fish and seafood products, which may deter substitution fraud.
All three substituted species for cod belonged to the Gadidae and are very similar to G. morhua in both taste and texture. Pollachius virens and Melanogrammus aeglefinus are frequently landed at the Belgian fish auction ("Vis -VLAM," n.d.), while Gadus chalcogrammus is mainly imported as filleted or processed product ("Vis-& zeevruchtengids -voor professionele gebruikers | Zeevruchten gids," n.d.). Similar substitutions were found in other studies (Christiansen et al., 2018; Herrero et al., 2010; Luque and Donlan, 2019; Tomás et al., 2017). The processor that imported the M. aeglefinus fillet that was sold as G. morhua claimed to be unaware of this substitution fraud. Although the substitution rate for cod was low, the results show that substitution does not only occur at the end of the food supply chain.

The substitution of sole in the Belgian supply chain (17 %) was much higher than the substitution rate recorded for cod, and comparable to the 20 % average for sole noted in the meta-analysis of Luque and Donlan (2019). The substitution rate of 8 % found in catering is comparable to the 11 % described by Christiansen et al. (2018) for Belgium, but substantially lower than the 50 % detected in Germany (Kappel, 2016). The substitution in Belgian retail (20 % of the samples) is lower than the 30 % recorded in Spanish shops and supermarkets (Herrero et al., 2012) [...] hypophthalmus, L. polyxystra and M. kitt) were used as substitutes when fillets were further processed, i.e. rolled up and covered in sauce. Also in Germany, L. aspera and Cynoglossus were common substitution alternatives for sole (Kappel, 2016; Rehbein, 2008). One of the Belgian processors confirmed that L. aspera is an increasingly popular alternative to sole. Sole is priced between 20 and 40 euros per kg, while L. aspera costs around 6 euros per kg, making it prone to economic fraud ("ISPC," n.d., "VisOnline," n.d.).
Other studies stated that substitution rates in restaurants may be higher due to less stringent controls (Christiansen et al., 2018; Fox et al., 2018; Hanner et al., 2011; Kappel, 2016; Khaksar et al., 2015; Shehata et al., 2017; Vandamme et al., 2016). This was not the case in our study, as substitution was detected at almost every step in the Belgian food supply chain. The course of action against species substitution varies depending on the point of the supply chain at which it takes place (Fox et al., 2018), which emphasizes the importance of extensive knowledge of substitution rates in different parts of the supply chain. The current study followed to a large extent the recommendations of the European Commission that were invoked for a coordinated control plan for fish species substitution across its member states (SANCO/12569/2014), in that major species of the Belgian consumer market were targeted in a variety of products (meals, fillets and other kinds of processed products) in different parts of the supply chain. However, our results indicate that the recommendation of collecting a total of 100 samples across different fishes and different steps in the supply chain may be insufficient to estimate substitution rates adequately, since substitution in cod was low but nevertheless present. We encourage future studies to collect a minimum and balanced number of samples in each step of the supply chain and for each species when the aim is to identify steps in the supply chain that are more prone to substitution practices. Such studies will greatly contribute to creating effective control and mitigation measures throughout the supply chain (Fox et al., 2018).

Conclusions
The SEAFOOD TOMORROW DNA barcoding reference database is free of misidentifications and provides a reliable basis for the identification of processed samples for commercially important European fish species, based on both COI and cytB gene markers. In addition, when the query sequence retrieves low similarity scores with any of the sequences in the database, this will point to a non-European fish species as substitute. To further identify such sequences, a comparison with public reference databases is needed. Sequences for commercially important export and import species may be added to the SEAFOOD TOMORROW database, provided they are vouchered and morphologically and genetically identified by experts.
The Belgian food supply chain appears as a complex web of potential interactions between traders. Still, most fish products have a more or less distinct pathway, with cod mainly brought into the supply chain via import from other European countries, and sole mainly originating from local Belgian fisheries and the fish auction.
Substitution fraud does occur for sole (17 %) and to a lesser extent for cod (2 %), two popular fish species in Belgium, at different steps along the Belgian food supply chain. Considering the huge economic value of these species, these substitution rates also impact the Belgian economy. More stringent control measures are needed to ensure more transparency for consumers, such that they can trust their purchases and the labelled information.

Acknowledgments
We thank the scientists at ILVO who helped with sampling, Tom Putteman for help in the lab, and Bart Ampe for help with the initial power calculations and design for the substitution experiment. This research is part of the EU Horizon2020 project SEAFOOD TOMORROW. This project has received funding from the European Union's Horizon 2020 funding programme, Grant Agreement no. 773400 (SEAFOOD TOMORROW). This output reflects the views of the author(s), and the European Commission cannot be held responsible for any use that might be made of the information contained therein. M. A. Faria thanks "Fundação para a Ciência e Tecnologia" for the researcher contract and financial support from the project UID/QUI/50006/2019.