Published February 9, 2024 | Version v3

Data to form periodic lossless ternary seeds of maximum weight (Part 1)

  • 1. University of Manchester
  • 2. University of Leeds
  • 3. Universite de Lille

Description

Data to form periodic lossless ternary seeds of maximum weight.

Detailed information can be found in the GitHub project (https://github.com/vtman/perlotSeeds). Code to generate periodic blocks (binary and ternary) can also be found there.

Binary seeds use only two symbols: 0 ("do not care", written "_") and 1 ("match", written "#"). The length of a seed is the number of its elements; the weight of a seed is the number of its 1-elements. The goal is to find seeds of maximum weight that can be used when two strings differ by a given number of mismatches. In many cases these maximum-weight seeds are observed to have a periodic structure: the same block is repeated several times, followed by its remainder. Blocks for binary seeds can be found with the help of the PerFSeeB project (https://github.com/vtman/PerFSeeB). These blocks have the maximum possible weight.
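The definitions of length, weight and periodic structure can be illustrated with a minimal Python sketch. The block `##_#` is a made-up example, not a block from this dataset, and the sketch assumes that "remainder" means a leading part of the block; the function names are hypothetical.

```python
def seed_weight(seed: str) -> int:
    """Weight of a binary seed: the number of '#' (match) elements."""
    return seed.count("#")


def periodic_seed(block: str, length: int) -> str:
    """Repeat `block` as many whole times as fit in `length`,
    then append the leading part of the block (assumed remainder)."""
    if not block:
        raise ValueError("block must be non-empty")
    copies, remainder = divmod(length, len(block))
    return block * copies + block[:remainder]


# Hypothetical block, chosen only for illustration.
block = "##_#"
seed = periodic_seed(block, 10)
print(seed)               # ##_###_###
print(len(seed))          # 10
print(seed_weight(seed))  # 8
```

Here the block of length 4 is repeated twice and followed by its first two elements, giving a seed of length 10 and weight 8.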

In genetics, sequences are written with four symbols (A, C, G, T). However, the probability of a point mutation is not the same for all pairs of symbols. A transition mutation (A ↔ G or C ↔ T) is often twice as likely as a transversion mutation (A ↔ C, A ↔ T, G ↔ C, G ↔ T). Transition-constrained seeds use the ternary alphabet {#, @, _}, where @ stands for a match or a transition mismatch. To generate ternary seeds, we first need to generate ternary blocks. These ternary blocks can be derived from binary blocks. However, we sometimes need to use binary blocks of less than the maximum weight.
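As an illustration of the ternary alphabet, the following Python sketch checks whether a transition-constrained seed accepts two aligned fragments. It is only a sketch of the symbol semantics described above, not code from the perlotSeeds project; the function names and the example seed `#@_#` are hypothetical.

```python
# Transition pairs (purine <-> purine, pyrimidine <-> pyrimidine).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def position_ok(symbol: str, a: str, b: str) -> bool:
    """Check one aligned pair of nucleotides against one seed symbol."""
    if symbol == "#":   # exact match required
        return a == b
    if symbol == "@":   # match or transition mismatch allowed
        return a == b or (a, b) in TRANSITIONS
    if symbol == "_":   # "do not care"
        return True
    raise ValueError(f"unknown seed symbol: {symbol!r}")


def seed_accepts(seed: str, x: str, y: str) -> bool:
    """True if every position of `seed` accepts the aligned fragments x and y."""
    if len(x) < len(seed) or len(y) < len(seed):
        raise ValueError("fragments must be at least as long as the seed")
    return all(position_ok(s, a, b) for s, a, b in zip(seed, x, y))


# Hypothetical seed and fragments, for illustration only.
print(seed_accepts("#@_#", "ACGT", "ATTT"))  # True: C/T is a transition under '@'
print(seed_accepts("#@_#", "ACGT", "AAGT"))  # False: C/A is a transversion under '@'
```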

BinaryDataLevel.zip contains binary blocks. Most are of maximum weight, but about 1/5 are for smaller weights (one less than the maximum and, for a couple of blocks, two less).

Files T1V1.zip, T1V2.zip, ..., T8V1.zip contain ternary blocks in binary format. T4V2.zip and T7V2.zip are in the other dataset.

File bestTernary.zip contains ternary seeds of maximum weight (calculated as the number of # symbols plus half the number of @ symbols).
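The weight formula for ternary seeds can be written as a short Python sketch. The function name and the example seed are hypothetical; only the formula (number of # symbols plus half the number of @ symbols) comes from the description.

```python
from fractions import Fraction


def ternary_weight(seed: str) -> Fraction:
    """Weight of a ternary seed: each '#' counts 1, each '@' counts 1/2."""
    return seed.count("#") + Fraction(seed.count("@"), 2)


# Hypothetical seed: two '#', three '@', one '_'.
print(ternary_weight("#@_#@@"))  # 7/2
```

`Fraction` is used so that half-integer weights are represented exactly; a float would work equally well for comparison purposes.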

Files

Files (40.0 GB)

Checksum Size
md5:54878765c3bced83d30d96faa0910744 323.7 MB
md5:0aebfe6f40606d851bc5efd2305cd38e 1.3 GB
md5:1d2461833902fc4c138eb056c93c1eab 1.9 MB
md5:b88f73788a2716a36c5f29499e7dd10b 59.2 MB
md5:61a95db8163e8b77a17adaf418d17148 1.4 MB
md5:69837929a7b1ddd4c69e173a4b142c9a 3.7 MB
md5:74fa8748a8543828fcef2ea98f238c73 1.3 MB
md5:e3b0113b1ca64e8feb1f61400584eeaf 562.1 kB
md5:e3cfee4b59a2633922e5285722bdb762 392.4 kB
md5:0db2f31080e6ae9e64a77c5e0b04bd09 38.8 kB
md5:c32b69d380000defcb59ff7667d80720 24.5 MB
md5:bcefaf05a7306b51ae5fb067f0e34f29 185.5 MB
md5:f1055cd3967fc9436db8c05304c50aac 117.4 MB
md5:f85528958205b1f13ada3abceb83ad53 58.7 MB
md5:7d487edeaf423922d9a33769f6266394 17.2 MB
md5:1fbac4355eaf9bae35e637d3a56affe3 30.8 MB
md5:0ae5d93e6b38c548d7b562d8469c53a7 5.0 MB
md5:e18f67a923277cab8bb79ff87b6401e7 9.1 MB
md5:3e18f0caeb47fb32d133ff87d1a4d5d1 5.0 GB
md5:360be249d9538ce35312012d1fc593f1 331.1 MB
md5:d883dfe95d30b96bec2581e2688a345d 24.3 MB
md5:823c1e476ccf7c54efb7ad31e632606e 151.3 MB
md5:2d03f11a8d55e393380030c4a97fa761 67.5 MB
md5:23d888b90671b6569b325fdbdd7c8dfe 17.1 MB
md5:89f870ecaae59f40ef496d26fea1bf2c 230.7 MB
md5:bc694c426c22353e58bbd71cf078b03c 154.1 MB
md5:1756b0f470a71bb140f879e5dc12ed3e 237.9 MB
md5:22d80b27f7a3f5ec13d1f8412ac68e55 378.7 MB
md5:7a101ef17543da2cbd5705d0bbb96b34 8.8 GB
md5:3a455dfe5eef81b614c56dc6a9c24e21 14.2 GB
md5:c0238b3e8a5e795aa94eab445cb57338 12.7 MB
md5:f3754f5e42ebf394aaa0a516fc1f5070 8.4 MB
md5:a2f7eca7b7b055ff9f540e6e133c6e32 6.2 GB
md5:621f718ba9c63db85bbd5f76f2b17602 2.0 GB
md5:e72ecb0d162bab0da142cb7d76ec52da 23.4 MB
md5:021cb105b79e6be51e9e2216eb061134 6.6 MB
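
A downloaded archive can be checked against its listed md5 value with a short Python sketch such as the one below. The function names are hypothetical, and since the listing here does not pair checksums with file names, the path and the expected value must be supplied by the user.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks
    so that multi-gigabyte archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed: str) -> bool:
    """Compare a file's digest with a listed value such as 'md5:54878765...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Usage (hypothetical local path; take the expected value from the record):
# matches_listing("downloads/bestTernary.zip", "md5:<value listed for that file>")
```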