Multi-source Heterogeneous Ecological Big Data Adaptive Fusion Method Based on Symmetric Encryption

In recent years, with the rapid development of the domestic economy, the concept of sustainable development has received increasing attention. Ecological environment protection is ever more important, and the ecological environment is closely related to economic development; how to measure the relationship between the two matters greatly. Whether building ecological environment protection or ensuring the sustainable development of the economy, the green development concept should serve as the guiding principle to promote ecological economic development, and research on the fusion of ecological data is of great significance for solving these problems. This thesis studies a multi-source heterogeneous (MSH) ecological big data (BD) adaptive fusion method (FM) based on symmetric encryption. A comparative experiment is set up in which a multi-sensor (MS) data fusion method (DFM) based on rough set theory and an MSH DFM based on data information conversion are compared with the symmetric-encryption-based MSH BD adaptive FM proposed in this paper. The results show that the MSH DFM based on rough set theory has a highest confidence of 0.812 and the MSH DFM based on data information conversion a highest confidence of 0.68, while the symmetric-encryption-based MSH BD adaptive FM has a fusion confidence of up to 0.965; the MSH ecological BD adaptive FM based on symmetric encryption is therefore superior.


Symmetric Encryption
(1) Overview of symmetric cryptography. Compared with asymmetric encryption algorithms (EAs), a symmetric EA has the characteristics of low computational complexity, small computational overhead, and high security when long keys are used. The specific implementation process is shown in Figure 1. According to the encryption method, symmetric cryptosystems can be divided into two categories: stream ciphers and block ciphers. A stream cipher encrypts the data stream character by character, while a block cipher splits the data into fixed-length groups and encrypts each group.

Figure 1. Encryption and decryption process
As Figure 1 shows, a complete encryption and decryption system consists of at least the following five parts: 1) message (plaintext) space M: the set of all possible packets to be encrypted; 2) ciphertext space C: the set of all possible encrypted messages; 3) key space K: each element of K includes an encryption key ke and a decryption key kd, that is, a key pair k = (ke, kd); 4) encryption algorithm E: a mathematical transformation that maps the plaintext space M to the ciphertext space C under the key ke; 5) decryption algorithm D: the inverse transformation that recovers the plaintext from the ciphertext under the key kd, so that D(kd, E(ke, m)) = m for every message m in M.
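As an illustration of this five-part structure (and only of the structure, not of the algorithms discussed below), a minimal Python sketch using a toy XOR keystream cipher, where the encryption and decryption keys coincide:

```python
# Toy symmetric cipher illustrating the five-part model:
# message space M = byte strings, ciphertext space C = byte strings,
# key space K = byte strings (here ke = kd = k), and E = D = XOR with
# a repeating keystream. For demonstration only -- NOT secure.

def xor_transform(key: bytes, data: bytes) -> bytes:
    """XOR each data byte with the repeating key; E and D coincide."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    return xor_transform(key, plaintext)

def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    return xor_transform(key, ciphertext)

key = b"k3y"
msg = b"ecological data"
ct = encrypt(key, msg)
assert decrypt(key, ct) == msg   # D(k, E(k, m)) == m
```

Because XOR is its own inverse, a single transform serves as both E and D, which is exactly the symmetry that gives these systems their name.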
(2) Commonly used symmetric EAs. 1) SMS4 EA. On June 1, 2016, China released the SMS4 EA. The SMS4 EA is a block symmetric EA whose block length and key length are both 128 bits. The basic process is as follows: each encrypted data block is 16 bytes long; the 16 bytes of input data are divided equally into four 4-byte words, and cyclic operations are performed to produce 32 round keys of 4 bytes each.
2) DES EA and its variants. In the early 1970s, IBM developed the Data Encryption Standard (DES). In 1977 it became the United States Federal Information Processing Standard (FIPS 46), a block cipher algorithm.
Single DES algorithm: the input is an 8-byte data block P; after encryption with an 8-byte key K, the output ciphertext C is obtained. The encryption formula is given by (1) and the decryption formula by (2):

C = E_K(P)    (1)
P = D_K(C)    (2)

The 56-bit key of the single DES algorithm gives a key-search complexity of 2^56, but there is still a risk of being deciphered. To improve security, 3DES resists such attacks by increasing the effective key length. 3DES EA: the triple data encryption algorithm, i.e., applying DES to the data three times. The 3DES algorithm divides its 16-byte key into two halves, K1 and K2.
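The two-key 3DES construction applies DES in encrypt-decrypt-encrypt (EDE) order. A minimal Python sketch of that composition, in which a toy XOR round stands in for the real DES primitive (so only the keying structure, not the cipher itself, is shown):

```python
# Sketch of the two-key 3DES (EDE) composition. A toy XOR transform
# stands in for the real DES primitive so the structure is runnable;
# this illustrates only how the two key halves K1, K2 are applied.

def toy_des_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for DES encryption E_K (toy XOR, NOT real DES)."""
    return bytes(b ^ k for b, k in zip(block, key))

def toy_des_decrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for DES decryption D_K (XOR is its own inverse)."""
    return toy_des_encrypt(key, block)

def tdes_encrypt(k1: bytes, k2: bytes, block: bytes) -> bytes:
    """C = E_K1(D_K2(E_K1(P)))"""
    return toy_des_encrypt(k1, toy_des_decrypt(k2, toy_des_encrypt(k1, block)))

def tdes_decrypt(k1: bytes, k2: bytes, block: bytes) -> bytes:
    """P = D_K1(E_K2(D_K1(C)))"""
    return toy_des_decrypt(k1, toy_des_encrypt(k2, toy_des_decrypt(k1, block)))

k1, k2 = b"12345678", b"abcdefgh"   # the two 8-byte halves of the 16-byte key
p = b"8bytedat"
c = tdes_encrypt(k1, k2, p)
assert tdes_decrypt(k1, k2, c) == p
```

A useful property of the EDE order is backward compatibility: setting K1 = K2 collapses the construction to single DES.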
Suppose the input data block is P and the encrypted ciphertext block is C. The encryption formula is shown in equation (3) and the decryption formula in equation (4):

C = E_K1(D_K2(E_K1(P)))    (3)
P = D_K1(E_K2(D_K1(C)))    (4)

3) AES. The Advanced Encryption Standard (AES) is an improvement on the DES algorithm; it is more secure and faster than DES, but its hardware cost is higher. Therefore, when hardware cost is not a concern, the overall performance of AES has obvious advantages over other symmetric EAs. The AES EA differs with the length of the key: for key lengths of 128, 192 and 256 bits, the number of rounds is 10, 12 and 14 respectively. The AES EA mainly consists of substitution/permutation round operations. Each round consists of three parts: a) the non-linear part: byte substitution (SubBytes) replaces each input byte with a corresponding output byte through the non-linear S-box substitution function, providing confusion; b) the linear mixing part: the ShiftRows(State) operation cyclically shifts each row of the state matrix, and each column of the resulting matrix is then mixed by a linear transformation, i.e., MixColumns(State); c) the key-addition part: the round sub-key is XORed with the state, AddRoundKey(State, ExpandedKey). The transformation order of the encryption process is: byte substitution, row shift, column mixing, round key addition, except that the last round has no column mixing. The transformation order of the decryption process is: inverse row shift, inverse byte substitution, round key addition, inverse column mixing, except that the last round has no inverse column mixing; the final round thus consists of inverse row shift, inverse byte substitution, and round key addition. The implementation principles of the above four transformations are described below.
The byte substitution SubBytes first computes the multiplicative inverse of the input 8-bit binary number in the finite field GF(2^8) modulo the irreducible polynomial, and then applies the affine transformation over GF(2), adding a constant binary sequence; see equation (6).
For an input byte a, its multiplicative inverse a^{-1} is found in GF(2^8); writing the element component-wise as (a7, a6, ..., a0), the affine transformation is defined as follows:

b = A · a^{-1} + c    (6)

where A is the 8x8 affine transformation matrix over GF(2) and c is a constant byte. Row shift (ShiftRows): the row shift is a cyclic shift transformation of the intermediate state State and depends only on the row index; different rows use different shift offsets. When the block length and key length are both 128 bits, the intermediate state is a 4x4 byte matrix whose four columns are 32-bit words. The row shift is transformed as follows:

S'_{i,j} = S_{i,(j+i) mod 4},  i, j = 0, 1, 2, 3    (7)

Column mixing (MixColumns): the column mixing operation is a linear transformation of the state matrix. Each input column a(x) is multiplied by the fixed polynomial c(x) = {03}x^3 + {01}x^2 + {01}x + {02} modulo x^4 + 1 to give the output column b(x):

b(x) = c(x) · a(x) mod (x^4 + 1)    (8)

The above expression yields the matrix equation:

[b0]   [02 03 01 01] [a0]
[b1] = [01 02 03 01] [a1]    (9)
[b2]   [01 01 02 03] [a2]
[b3]   [03 01 01 02] [a3]

Round key addition (AddRoundKey): the round key has a length of 128 bits, and the state matrix State and the round key are added (XORed) in the finite field GF(2^8). 4) IDEA. The International Data EA (IDEA) is a block cipher algorithm with a block length of 64 bits and a key length of 128 bits. In 1992, Lai and Massey created the IDEA EA while improving PES. Like 3DES, it addresses the security problem that DES suffers because of its short key length. It is mainly used for email security in PGP and for file-system security.
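The row-shift transformation of equation (7) and its inverse can be sketched in a few lines of Python (the 4x4 state is represented as a list of rows; illustrative only):

```python
# ShiftRows on a 4x4 AES state: row i is cyclically shifted by i bytes;
# the inverse transform shifts each row back by the same offset.

def shift_rows(state):
    """state: list of 4 rows, each a list of 4 bytes."""
    return [row[i:] + row[:i] for i, row in enumerate(state)]

def inv_shift_rows(state):
    """Inverse transform: undo the cyclic shift of each row."""
    return [row[-i:] + row[:-i] if i else row for i, row in enumerate(state)]

state = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
shifted = shift_rows(state)
assert shifted[0] == [0, 1, 2, 3]          # row 0 is unchanged
assert shifted[1] == [5, 6, 7, 4]          # row 1 shifted by 1
assert inv_shift_rows(shifted) == state    # round trip recovers the state
```

Because ShiftRows only permutes bytes, it is a pure wiring operation in hardware, which is one reason AES rounds are cheap to implement.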

DFM
(1) Basic principles of data fusion. Data fusion is also called MSH information fusion; it is the study of information processing in a MSH system. Data fusion is the coordinated optimization and comprehensive processing of information from multiple sensors or multiple sources to generate more accurate, reliable and valuable information. The hardware foundation of data fusion is the MSH system; its processing object is multi-source information; its core is coordinated optimization and comprehensive processing; and its purpose is to obtain more accurate, safe and reliable data. Data fusion is a basic function that is ubiquitous in humans and biological systems. Humans instinctively integrate the information (images, sounds, smells, etc.) observed by various organs (eyes, ears, noses, etc.) with prior knowledge in order to respond to the surrounding environment in time and make timely estimates of ongoing events. Because the human senses have different metrics, the various phenomena occurring in a spatial range can be measured from different angles and, through the fusion of different features, transformed into valuable interpretations of the environment.
(2) Hierarchy of data fusion. According to its level of abstraction in sensor processing, data fusion can be divided into three categories: data-layer, feature-layer and decision-layer fusion. Data-layer fusion, also called sensor-layer fusion, is the lowest level of fusion: the data detected by the sensors are processed directly, and the rawest information collected by the sensors is processed and analyzed. The advantage of this kind of processing is that it retains as much of the original useful information as possible and provides detailed information that fusion models at other levels cannot, with little data loss and high precision. Since data-level fusion is carried out at the lowest level, and is affected by the incompleteness of the information acquired by each sensor and by unstable changes in the external environment, the fusion system needs strong fault tolerance and robustness. The fusion model is shown in Figure 2. As Figure 2 shows, data-layer fusion first collects data with multiple sensors, filters out the associated data and passes it to the data layer for fusion, performs feature extraction and attribute determination, and finally makes a joint attribute decision and outputs the related data.
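As a toy illustration of data-layer fusion, where raw readings are combined directly before any feature extraction, redundant sensor readings of the same quantity can be merged by inverse-variance weighting (a generic technique, not the paper's method; the readings and variances below are made up):

```python
# Data-layer fusion sketch: raw readings of the same quantity from
# several sensors are combined directly by inverse-variance weighting,
# so that more reliable (lower-variance) sensors contribute more.

def fuse_raw(readings, variances):
    """Inverse-variance weighted average of redundant raw readings."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, readings)) / total

# Three temperature sensors observing the same field plot:
readings = [20.1, 19.8, 20.4]
variances = [0.04, 0.01, 0.25]   # assumed sensor noise variances
fused = fuse_raw(readings, variances)
assert 19.8 <= fused <= 20.4     # fused value stays within the readings
```

Note how the fused value leans toward the second sensor, whose variance is smallest, which is exactly the precision advantage the text attributes to data-layer fusion.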
Feature-layer fusion mostly adopts a distributed or centralized fusion structure. The specific model is shown in Figure 3. As Figure 3 shows, feature-layer fusion extracts locally representative data from the different sensors and then combines these local data into a vector with significant features; these feature vectors are then comprehensively analyzed and processed. In general, the feature information extracted by each sensor is a sufficient statistic of the underlying data, so the feature-layer fusion model loses some useful information and its fusion performance is somewhat reduced. Its advantage lies in the large compression of the original data, which is conducive to real-time processing.
Decision-layer fusion is a high-level fusion; the specific model is shown in Figure 4. As Figure 4 shows, preliminary information on the object is formed after basic processing operations such as preprocessing, feature extraction and judgment recognition are performed locally on the information from the different types of sensors; the fusion center then further integrates the local decision results. Because the decision layer produces an overall decision from the results of the individual sensor decisions, this fusion model has the largest data loss and the lowest accuracy. Its advantages are the system's low dependence on the sensors and its strong anti-interference ability.
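As a simple illustration of decision-layer fusion (a generic scheme, not the paper's method), the local decisions produced by each sensor can be combined at the fusion center by confidence-weighted voting; the labels and confidences below are hypothetical:

```python
from collections import defaultdict

# Decision-layer fusion sketch: each sensor reports a local decision
# (a class label) with a confidence; the fusion center selects the
# label with the largest total confidence. Illustrative only.

def fuse_decisions(local_decisions):
    """local_decisions: list of (label, confidence) pairs, one per sensor."""
    scores = defaultdict(float)
    for label, confidence in local_decisions:
        scores[label] += confidence
    return max(scores, key=scores.get)

decisions = [("healthy", 0.9), ("stressed", 0.6), ("healthy", 0.7)]
assert fuse_decisions(decisions) == "healthy"   # 1.6 vs 0.6
```

Because only compact labels and confidences cross the network, this level tolerates sensor failure well, at the cost of the information discarded locally, matching the trade-off described above.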

MSH BD Adaptive FM Based on Symmetric Encryption
(1) Time and space accumulation information acquisition. First, a set Θ (the frame of discernment) is given, and the definition of the set is transformed through the belief function Bel: Bel assigns a number in the range [0, 1] to each subset of Θ, so that for a subset A of Θ the value Bel(A) can be obtained, that is, the reliability of any element in the subset A. Assuming that the frame Θ contains n elements, its power set is:

2^Θ = {∅, {θ1}, {θ2}, ..., Θ}

which contains 2^n elements. Then the basic probability assignment (BPA) function m is used to match probability values to the elements of the power set 2^Θ, giving a mapping from the subsets of the power set to the unit interval: m: 2^Θ → [0, 1], so the mapping of a subset A is m(A). A subset whose basic probability assignment is greater than 0 is called a focal element; m(A) is the total reliability assigned to A and mainly reflects the reliability of A itself. The basic probability assignment functions associated with these reliabilities are combined by orthogonal summation. Let m1 and m2 be two basic probability assignment functions on the power set 2^Θ, with focal element sets {A_i} and {B_j}; m = m1 ⊕ m2 represents their orthogonal sum. Since the orthogonal sum is a separate data structure, the Dempster rule is used to effectively fuse the orthogonal spatio-temporal data: suppose the MSH sensors are used to detect the same subject, and the power set 2^Θ represents all the subsets to be tested; within the power set each subset is relatively independent, the empty set represents conflicting information, and the mass assigned to the empty set is redistributed through the Dempster rule to obtain new accumulated information. After the MSH sensors complete the calculation of the accumulated information, the accumulated information is combined to obtain the spatio-temporal accumulation information of the MSH data.
The specific acquisition method is as follows: the first quantity represents the spatio-temporal accumulation information of the MSH data, and the second represents the accumulated-information combination coefficient. This step is repeated for the remaining sensors to obtain the time and space accumulation information for all MSH data.
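The Dempster combination at the heart of this step can be sketched in Python. The following is a generic implementation of the rule over mass functions keyed by frozenset focal elements; the paper's specific combination coefficient is not reproduced, and the sensor masses are hypothetical:

```python
from itertools import product

# Dempster's rule of combination for two basic probability assignments
# (BPAs) over the same frame of discernment. Masses are dicts keyed by
# frozenset focal elements; mass falling on the empty set (conflict)
# is removed and the remainder renormalized. Generic sketch only.

def dempster_combine(m1, m2):
    combined = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb          # mass assigned to the empty set
    k = 1.0 - conflict                   # normalization factor
    return {focal: mass / k for focal, mass in combined.items()}

# Two sensors observing the same subject over the frame {a, b}:
m1 = {frozenset({"a"}): 0.6, frozenset({"a", "b"}): 0.4}
m2 = {frozenset({"a"}): 0.7, frozenset({"b"}): 0.1, frozenset({"a", "b"}): 0.2}
fused = dempster_combine(m1, m2)
assert abs(sum(fused.values()) - 1.0) < 1e-9   # result is a valid BPA
assert fused[frozenset({"a"})] > 0.8           # agreement on "a" reinforced
```

Combining further sensors is just repeated application of the rule (it is associative and commutative), which mirrors the "repeat this step for the remaining sensors" instruction above.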
(2) Establishment of the symmetric-encryption-based MSH BD adaptive fusion sensor model. The MSH data intelligent fusion model is established using the acquired spatio-temporal accumulation information of the MSH data, yielding a symmetric-encryption-based MSH BD adaptive fusion sensor model.
The model consists of four levels: object fusion, potential-state fusion, data-source fusion, and process fusion. Object fusion combines the identity, parameter and location information to describe an individual in detail, with the aim of achieving a consistent conversion of the coordinates and units of the MSH data; potential-state fusion fuses the relationship information of the objects; data-source fusion fuses the MSH data sources; process fusion is an intermediate process, mainly controlling the performance of the data fusion and identifying information to improve the fusion effect. After these four levels are completed, multi-level, stepwise processing of the multi-source data is realized and the intelligent fusion of the MSH data is complete.

Setting up the Experiment
To evaluate the symmetric-encryption-based MSH BD adaptive FM for the agricultural Internet of Things proposed in this paper, a comparative experiment is designed. The experiment compares the MSH DFM based on rough set theory and the MSH DFM based on data information conversion with the symmetric-encryption-based MSH BD adaptive FM proposed in this paper.

Experimental Parameter Settings and Device Settings
The experimental parameter settings are shown in Table 1.

Table 1. Experimental parameter settings
Sensor node transmission radius/m          100
Data acquisition radius/m                  20
Transmission packet size/b                 500
Node-to-node transmission bandwidth/(kb/s) 100
Number of disjoint routes                  3

As Table 1 shows, the sensor node transmission radius is 100 m, the data acquisition radius is 20 m, the transmission packet size is 500 b, the node transmission bandwidth is 100 kb/s, and the number of disjoint routes is 3. The equipment configuration used in the experiment is shown in Table 2. Table 2 lists the experimental equipment and its configuration: the experiments run on a Windows 7 system (4 GB RAM, Intel Core i5-3470 processor); the test equipment comprises 1 coordinator node, 1 router node, 6 terminal collection nodes, and 1 PC terminal; the coordinator and the PC terminal are linked via a serial port, and the serial port prints data in binary format.

Experimental Process
One PC terminal, one coordinator node, one routing node and six terminal acquisition nodes are used as the test equipment for measuring the MSH data fusion confidence in this experiment, with Matlab as the experimental platform for the MSH data fusion. To ensure the validity of the experiment, the fusion confidences of the MSH DFM based on rough set theory, the MSH DFM based on data information conversion, and the proposed symmetric-encryption-based MSH BD adaptive FM are compared. The computational timeliness of the three methods is then compared: the experiment divides the data into four groups, expanded to 1 GB, 12 GB, 24 GB and 120 GB respectively, and the symmetric-encryption-based MSH BD adaptive FM is compared experimentally with the other two methods in terms of parallel training time consumption. Finally, data sets of different sizes are run on clusters with different numbers of nodes, and the speed-up ratio is obtained from the formula.
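The speed-up ratio referred to here is conventionally the single-node running time divided by the n-node running time. A minimal sketch (the times below are illustrative placeholders, not the paper's measurements):

```python
# Speed-up ratio: S(n) = T(1) / T(n), where T(n) is the running time
# of the same workload on a cluster of n nodes.

def speedup(t_single: float, t_cluster: float) -> float:
    """Ratio of single-node time to cluster time; > 1 means faster."""
    return t_single / t_cluster

t1, t8 = 1200.0, 400.0   # hypothetical runtimes in seconds (1 vs 8 nodes)
assert speedup(t1, t8) == 3.0
```

Plotting S(n) against n for each data-set size is what produces acceleration-ratio curves like those in Figure 8.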

Comparative Analysis of Experimental Fusion Confidence
The fusion confidence comparison is shown in Figure 5.

Algorithm Performance Analysis
(1) Comparative analysis of algorithm timeliness. In terms of computational timeliness, this experiment divides the data into four groups, expanded to 1 GB, 12 GB, 24 GB and 120 GB respectively, and compares the MSH DFM based on rough set theory and the MSH DFM based on data information conversion with the symmetric-encryption-based MSH BD adaptive FM. The result is shown in Figure 6. Figure 6 shows that when the data volume is 1 GB or 12 GB, the processing times of the three methods differ little. When the data volume reaches 24 GB, the symmetric-encryption-based MSH BD adaptive FM proposed in this paper begins to show its advantage in timeliness. When the data volume reaches 120 GB, the MSH DFM based on rough set theory takes 1500 s and the MSH DFM based on data information conversion takes 1350 s, while the symmetric-encryption-based MSH BD adaptive FM takes only 500 s, roughly one third of the time of the other two algorithms. This demonstrates the superiority of the symmetric-encryption-based MSH BD adaptive FM in terms of timeliness. The parallel training time consumption is shown in Table 3 and Figure 7. As Table 3 and Figure 7 show, the parallel training time of the symmetric-encryption-based MSH BD adaptive FM is lower than that of the other two methods and its efficiency is higher; the larger the data volume, the more obvious the superiority of the symmetric-encryption-based MSH BD adaptive FM.

Acceleration Ratio Experiment Results
Data sets of different sizes are run on clusters with different numbers of nodes; the speed-up results are shown in Figure 8. As Figure 8 shows, when the data set is small, there is no significant difference in data processing efficiency. As the amount of data increases, the system platform exhibits an efficient processing rate: with the symmetric-encryption-based MSH BD adaptive FM, the processing time grows linearly, and the cluster is nearly three times faster than the other processing methods. This indicates that the symmetric-encryption-based MSH BD adaptive FM can meet the performance requirements of ecological heterogeneous BD fusion processing.

5. Conclusions
Ecological civilization is the foundation of the social civilization system, and integrating MSH ecological BD and organizing the data is important for sustainable development. On this basis, this paper proposes a symmetric-encryption-based MSH ecological BD adaptive FM and compares it with the MSH DFM based on rough set theory and the MSH DFM based on data information conversion; the MSH ecological BD adaptive FM proves more advantageous.
In this paper, a comparative experiment is set up in which the MSH DFM based on rough set theory and the MSH DFM based on data information conversion are compared with the symmetric-encryption-based MSH ecological BD adaptive FM proposed here. The experimental results show that the MSH DFM based on rough set theory has a highest confidence of 0.812 and the MSH DFM based on data information conversion a highest confidence of 0.68, while the symmetric-encryption-based MSH BD adaptive FM has a fusion confidence of up to 0.965. By comparison, the fusion confidence of the symmetric-encryption-based MSH BD adaptive FM is the highest, which verifies the superiority of the proposed method.
In addition, the MSH DFM based on rough set theory and the MSH DFM based on data information conversion are compared with the symmetric-encryption-based MSH ecological BD adaptive FM in terms of timeliness. The results show that when the amount of data processed is small, the processing times of the three methods differ little; when the amount of data reaches a certain level, the symmetric-encryption-based MSH ecological BD adaptive FM is clearly superior to the other two methods, with higher timeliness.