DEVELOPMENT OF THE SEARCH METHOD FOR NON-LINEAR SHIFT REGISTERS USING HARDWARE, IMPLEMENTED ON FIELD PROGRAMMABLE GATE ARRAYS

The nonlinear feedback shift registers of the second order in GF(2) are considered, because based on them it can be developed a generator of stream ciphers with enhanced cryptographic strength. Feasibility of nonlinear feedback shift register search is analyzed. These registers form a maximal length sequence, using programmable logic devices. Performance evaluation of programmable logic devices in the generation of pseudo-random sequence by nonlinear feedback shift registers is given. Recommendations to increase this performance are given. The dependence of the maximum generation rate (clock frequency), programmable logic devices on the number of concurrent nonlinear registers is analyzed. A comparison of the generation rate of the sequences that are generated by nonlinear feedback shift registers is done using hardware and software. The author suggests, describes and explores the search method of nonlinear feedback shift registers, generating a sequence with a maximum period. As the main result are found non-linear 26, 27, 28 and 29 degrees polynomials.


Introduction
There is currently rapid development of cryptanalytic systems. One of the main requirements to the main element of the cryptographic stream encryption system -a generator of pseudorandom sequences (PRSs) is an indiscernible of the sequence, complexity, speed and repetition period for PRSs [1]. Cryptographic primitives that meet these requirements are constructions on the basis of linear feedback shift registers (LFSRs).
Common cryptographic algorithms, which are built using LFSRs, are: stream cipher A5/1, used to ensure the privacy of telephone mobile communication of GSM standard [2]; stream cipher E0, used in the Bluetooth protocol [3], etc. LFSR main disadvantage is its linearity, which leads to a relatively simple cryptanalysis [4].

Overview of the problem and formulation of research problems
As an alternative for LFSRs to PRSs generation in the stream cipher nonlinear feedback shift registers (NLFSRs) are proposed. NLFSRs on the basis of stream ciphers are included in Achterbahn [5], Dragon [6], Grain [7], Trivium [8] and VEST [9]. In [10] it is shown that NLFSRs are more resistant to cryptanalytic attacks than LFSRs. Using L cells NLFSRs, a cryptanalyst can take up to ( ) L 2 Ο [11] or as given in [12], a sequence of L ( L 1) / 2 L ⋅+ +bits is necessary to determine the structure of L-bit NLFSR generating this sequence.
At the same time, large size NLFSR generation with a guaranteed period remains an unsolved problem [13]. Only some special cases were considered in [14]. It is known that LFSRs generate maximum length sequence (M-sequence) equal to 2 L -1 if and only if when its characteristic polynomial is primitive [15]. For NLFSRs such property is not found to this day. Small NLFSRs with a maximum period can be built with the help of simulation. Nevertheless, modern computing capacities allow to simulate NLFSRs with a size only to L<35 [16]. In [17], NLFSRs with a size to L<26 are shown. This is insufficient for cryptographic applications that require long periods, for example, 2 128 [18].

Computer Sciences and Mathematics
Non-standard hardware solution for implementation of cryptographic algorithms is the use of field programmable gate arrays (FPGAs) [19]. FPGA allows to construct digital devices with high-level hardware description languages, which reduces the complexity of the development and allows the reuse the code through the use of IP-cores [20].
In [21], a search of NLFSRs that are implemented on FPGAs on the basis of analysis of de Bruijn sequences is carried out. Good statistical properties of NLFSR-generated sequences are confirmed and obtained non-linear 25 and 27 polynomials are given.
FPGAs have a high versatility and reliability, which ensures thorough testing [22]. Programmable logic is attractive due to possibility of providing performance close to ASIC technology, to achieve high throughput with high construction flexibility and low power consumption [23].
One of the important advantages of the implementation of cryptographic algorithms on FPGAs is the ability to construct a parallel and asynchronous architecture, superior to GPU and CPU microprocessor-based solutions in performance [24].
In this paper we study NLFSRs performance implemented on the FPGAs and problems of their optimization. Search method of NLFSRs generating M-sequence (M-NLFSRs) is given. This method is based on a practical synthesis of the results obtained previously. The aim of this research is to explore the possibility of NLFSR implementation of FPGA, as well as the development of an acceptable, for hardware and temporal parameters, the search algorithm for M-NLFSR.

M-NLFSR search method
The registers are used multiplication of only two cells, and such NLFSRs are called NLFSRs of the second order. Later, in NLFSR operation we understand NLFSR of the second order in GF (2). General NLFSR construction for the register, consisting of L4 = cells, is shown in Fig. 1. -value of the i-th register at the time t, Q -generated sequence, bits. Ä denotes a nonlinear function of multiplication and Å -linear function of summation.
Preliminary feedback coefficients ij a are limited that equate some of the coefficients to zero. The total number of variables (initially not equal to zero) of feedback coefficient ij a denoted as 1 n . As has been shown in [25], the number of feedback coefficients ij a for NLFSR with L size is calculated by ratio L n L ( L 1) / 2 =⋅ + , therefore, 1 n can take values in the range 1L 0n n ≤≤. M-NLFSR search method consists of two stages that can be carried out simultaneously or sequentially.
Stage 1: There is a decrease in the test set by excluding NLFSRs not forming an M-sequence. The total amount of rejected polynomials is selected in the range of 90-99 % of the possible amount. The volume of rejected polynomials is limited by available memory and time resources.
Rejected set is defined by an analytical method, which is implemented in software. The analytical method consists of checking the feedback coefficients to meet certain requirements as Computer Sciences and Mathematics detailed in [25,26,12]. The essence of these requirements is to analyze the place, type and arrangement of non-zero coefficients of feedback ij a . Mismatch of ij a coefficients to specified requirements means the inability of NLFSR to form M-sequence and, therefore, its rejection.
After the first stage, we obtain the set of polynomials, acceptable (for time-consuming) to check in the second stage. Let's denote the number of such polynomials as FPGA 0 k . Stage 2: Direct verification of the set of polynomials, obtained after the first stage, is carried out for the possibility of M-sequence generation by them is carried out using computer system.
As shown below, for moving the computational process from CPU on the FPGA, performance (speed) of each of NLFSRs is maintained. This fact allows to tens or hundreds of times increase the overall performance of the complex for M-NLFSR search compared to use only a PC.

The structure of the computer system for M-NLFSR search
Complete structure of computer system, which was used for the calculations, is shown in Fig. 2. USB-TTL converter was used as communication equipment (the maximum data transfer speed of 921 600 bps), performs the function of Universal Asynchronous Receiver/Transmitter (UART) between the FPGA and the PC. Two USB-TTL converters were used to enhance the data rate. They operate in parallel with a maximum data rate.
The software realizes: reading data from the hard disk; formation of a package to send to the FPGA; synchronous (for two USB-TTL converters) data sending to the specified COM port; receiving information through the COM port from the FPGA; decoding, visualization and storage on disk test results carried out in the FPGA.

FPGA-based NLFSR performance analysis
Hardware performance is a number of tested NLFSRs per time unit. Hardware performance consists of the following components: -time for one cycle in the module that implements NLFSR operation (denoted as FPGA cycle t ). Cycle is implementation of cyclic changes in NLFSR state in which all possible combinations are implemented for M-NLFSR, and the system returns to the initial state. Registers change their state in one clock cycle of the reference frequency, therefore amount of these clock cycles in a cycle should be L 2 ; -the number of simultaneously tested NLFSRs per cycle (denote as FPGA , are shown in Fig. 3. . Also FPGA max f significantly affect the temperature factor, which can be partly explains a small deviation observed in Fig. 4.

Computer Sciences and Mathematics
Let's consider in detail how to change the above characteristics on the number of simultaneously tested NLFSRs. , on the number of simultaneously processed polynomials in FPGA is shown in Fig. 5.
As can be seen in the graph, maximum system performance can't be achieved with maximum use of all FPGA elements due to the need to reduce the clock frequency. The optimal number of simultaneously tested NLFSRs in this case is in the range from 79 to 115, after which the time for search begins to increase.
In For comparison, Table 1 shows the time for generation of 1 GB NLFSR implemented in FPGA on different clock frequencies ( FPGA f ).  As can be seen from Table 1, the use of the FPGA-based hardware components for PRSs generation can improve performance (two and more times) as compared to using only the CPU of a personal computer.
It is worth noting that these results are valid for used FPGA model. Using another FPGA model, it is likely to get higher system efficiency. Thus, [28] shows the results of the efficiency of hardware implementations of ciphers of GRACE-S family for various FPGA families. Unfortunately, investigated families didn't contain Cyclone IV, but tested FPGAs of Stratix IV and Stratix V family showed more than two times greater efficiency than FRGAs of Cyclone II or Cyclone V family.

Search results for M-NLFSRs
M-NLFSRs search with a size of L =26, 27, 28 and 29 with the above described method was carried out. Obtained polynomials, as well as time expenditures, are shown below. The use of hardware component will significantly reduce M-NLFSRs search time. As an example, for software search at L=25 was spent 18 days, and by using hardware components for L=26 -less than two days. Estimated time for which would have made a similar search for L=26, using a software method, would constitute more than 3 months. Taking into account the modifications, introduced in the search algorithm for L=27, 28 and 29, we can expect another three-time increase in performance for L=26, while using hardware components. Thus, using this method, we get a general performance increase for search of M-NLFSRs generating M-sequences by more than 150 times as compared with the method using only PC CPU resources. The use of more expensive and productive FPGAs will further reduce the time to search of M-NLFSRs.

Conclusions
The method for M-NLFSRs search using hardware-software system is given. Based on the method, as an example, the result of the search of non-linear 26, 27, 28 and 29 degrees polynomials for generation of M-sequences is given.
Use of FPGA-based hardware component can significantly improve the performance of complex for search of M-NLFSR. The time for M-NLFSR search is increased by more than 150 times as compared to using only the computing power of PC. FPGA-based NLFSR performance capabilities are given. Recommendations on optimizing the performance of the complex to search of M-NLFSR are given.

Computer Sciences and Mathematics
A performance comparison of NLFSR generators performed by using the software and hardware implementation is carried out. It is shown that the hardware implementation of single FPGA-based NLFSR isn't inferior to the performance of its program analogue and in the case of a generator with several parallel NLFSRs can significantly exceed its.
Stable performance of FPGA-based NLFSR is implemented with a clock frequency (generation speed in NLFSR), equal to 725 MHz.
On the basis of the results it can be developed stream ciphers with enhanced cryptographic strength. The application of obtained non-linear polynomial allows to obtain stream ciphers, which are analogues of modern developments in this field, and the use of the described search method allows to find higher degree M-NLFSR.