

# **Implementation of ACRA for Biomedical Image Enhancement**

## with an Improved Design

Y. Lalith Prasad<sup>1</sup>, B. Rajeshwari<sup>2</sup>, Rashmi Seethur<sup>3</sup>

<sup>1</sup>M.Tech (Student), Department of Electronics and Communication Engineering, PES University, Bengaluru, Karnataka, India
<sup>2</sup>Professor, Department of Electronics and Communication Engineering, PES University, Bengaluru, Karnataka, India
<sup>3</sup>Professor, Department of Electronics and Communication Engineering, PES University, Bengaluru, Karnataka, India

**Abstract** - The widespread use of portable systems in modern VLSI technology creates a demand for circuits with lower heat and power consumption. Low-power operation is mandatory for achieving superior performance, and VLSI technology has therefore advanced rapidly toward low-power CMOS circuits. Approximate computing is an effective technique for minimizing the cost of computing. It requires a trade-off among the power dissipation, delay and performance of the circuit. However, accuracy requirements vary between applications, and in certain circumstances accurate outputs are needed. This paper therefore presents an accuracy-configurable radix-4 adder (ACRA) that uses power gating to switch some of the adder's logic gates on or off, so that both approximate and accurate outputs can be computed efficiently. To reduce the error distance between the accurate and approximate outputs, the partial sum of a single adder section is altered when the ACRA operates in approximate mode. The ACRA is compared with other adders, namely the RCA, CLA and RD4-A. The results indicate that the ACRA outperforms the other adders in terms of accuracy and power-delay product. In addition, an image processing assessment was carried out with the 8-bit ACRA, which achieved a higher peak signal-to-noise ratio and a lower mean square error, yielding a better-quality image.

\*\*\*

*Keywords*: RCA, CLA, RD4-A, ACRA, adders, CMOS, image quality, Tanner EDA, power gating.

### **1. INTRODUCTION**

There are numerous applications [1]–[3] related to image processing, many of which need substantial computational effort. Power consumption and performance are the two main circuit design challenges for these applications. Error-tolerant applications use approximate computation [5]–[7], a design technique that trades off the cost of a design against its performance. By modifying the circuit and relaxing its computation, this technique reduces circuit area [10], delay [9] and power consumption [8], and hence addresses the disadvantages of conventional computing.

Most integrated circuits use adders as a fundamental component for arithmetic functions. Approximate adders are used in error-tolerant applications to produce acceptable outputs. Many researchers [10]–[14] have employed a fixed-accuracy technique for designing approximate adders. Adders of this type cannot be used for accurate computation, and their accuracy cannot be modified. Despite their advantages in area and performance, approximate adders have difficulty [15] meeting the accuracy requirements of different applications. For instance, image recognition [15] needs high computational accuracy, whereas image compression tolerates lower accuracy. Moreover, in certain circumstances [14], a digital system with a common processor should be able to perform both accurate and approximate computations. If the accuracy provided by the adder is low, it may lead to excess energy dissipation [1]. Besides, power-constrained devices must be able to regulate the power dissipated by computations.


To improve the power-delay product (PDP), the radix-4 adder (RD4A) [11] processes two bits concurrently [12], as opposed to the carry-lookahead adder (CLA) and the ripple carry adder (RCA), which process one bit at a time. This paper therefore presents an accuracy-configurable radix-4 adder (ACRA) that minimizes the power dissipation of the circuit while performing approximate computations.

Power gating is used in the ACRA, where the partial sum of one RD4A and the carry-in related logic gates are modified dynamically to decrease the error distance between the accurate and approximate outputs. The ACRA has two modes, accurate and approximate; depending on the application, it produces an accurate or approximate output at run time. Using 250-nm CMOS technology, the analysis shows that the ACRA has lower power consumption and propagation delay than the RCA and CLA. In addition, the ACRA significantly outperforms the other accuracy-configurable adders in image processing, based on the measured peak signal-to-noise ratio (PSNR) and mean square error (MSE) values.

The design of the ACRA is characterized by the following.

- 1) The ACRA can operate as an approximate or an accurate adder by controlling the power supply of certain logic gates of the modified RD4A components.
- 2) In approximate mode, the output of the ACRA is kept within a small error distance, which is acceptable for certain image-oriented applications.



The results obtained with the ACRA show a marked improvement in propagation delay, error distance and power dissipation.

The rest of this paper is organized as follows. Section 2 reviews the literature on related adders. Section 3 presents the circuit designs of the RCA, CLA, RD4-A and ACRA. Section 4 reports the experimental results of these adders, including power and delay for 8, 16 and 32 bits. Finally, Section 5 concludes the study.

#### 2. Literature Review

A new method of performing binary addition, a fundamental operation in microprocessors, has been developed in [1]. It uses an RCA consisting of four full-adder (FA) blocks built with majority gate-based CMOS output-wired logic. The sum and carry outputs are produced by this majority gate-based logic, and the carry is passed on to the next stage. The design uses fewer transistors and has less delay than conventional ripple carry adder circuits, and it was tested using the Tanner tool [1].

The demand for portable electronics has led to a need for low-power VLSI circuits. Recent research has focused on designs for logic functions using pass transistors and transmission gates, but formal design procedures have been lacking [2]. The issue is addressed by designing a CMOS pass-network XOR-XNOR cell with few transistors that operates reliably within particular limits when the supply voltage is low. Using this XOR-XNOR cell, an FA cell with a minimal transistor count is presented, with a focus on reducing power consumption and operating at low supply voltages. This research aims to meet the increasing demand for low-power VLSI systems in mobile communications and portable equipment.

The 1-bit FA is an essential component in application-specific ICs for portable devices such as PDAs and cell phones, which require high speed with low power consumption. The paper in [3] proposes three new 1-bit full adders using XOR and XNOR gates with a delay of 2T, which have lower delay, fewer transistors and lower power dissipation than existing full adder cells. These adders suffer from the threshold-loss problem when cascaded, but they can be used to build larger circuits with CMOS gates at most stages [3]. The authors compare the power dissipation, delay and area of the new FAs with existing ones and find them to be viable options for efficient design.

Reversible logic is an important technology in quantum computing because it allows high speed and low power consumption, subject to a trade-off with switching speed. The article in [4] discusses the implementation of the RCA and CLA using reversible logic gates and the benefits of reversible logic for solving NP-complete problems [4]. It also notes that, to improve efficiency, it is important to find the ideal block size, use the fewest additional bits, and ensure that each input has a corresponding output. The RCA and CLA are important in the development of arithmetic blocks for quantum systems. However, the fast logic results in high energy consumption.

Adders are used for more than just addition, including multiplication, division, and address calculation. The CLA is a particularly efficient adder because it saves time in propagating carry bits. The article in [5] describes the design of an integrated circuit (IC) layout for the CLA using the full-custom method and the open-source Electric VLSI Design System [5]. The layout must adhere to IC design rules and pass a design rule check before being simulated in LTspice IV to verify functionality. The miniaturization of electronic devices is made possible by advances in IC technology, such as CMOS technology, which consumes less power. Further research in IC design is important for the production of more efficient and reliable ICs.

Chip fabrication technology has improved over time, making chips more efficient in terms of power, delay and area. In VLSI design, creating small, fast, low-power digital devices is important, and area and speed are key issues in adder design. Various techniques can be used to design adders, such as adiabatic logic, GDI, ECRL and transmission gates. The RCA and CLA are two techniques used for designing adders: the RCA is slower but uses less power, while the CLA is faster but uses more power. In digital circuit design there is thus a conflict between speed and power [6].

The CLA is widely used in high-performance computing systems. The article in [7] proposes a modified 4-bit CLA that uses hybrid AND and XOR gates to generate the carry propagate and carry generate terms. The remaining CLA circuits are similar to earlier designs [7]. The proposed design shows a marked improvement over the conventional design in energy consumption, delay and power-delay product. It also requires fewer transistors than the conventional design, resulting in a smaller chip area and lower power dissipation.

### 3. Adders

### 3.1 RCA

#### 3.1.1 RCA Logic Diagram

A ripple carry adder is a circuit in which the carry-out of every full adder is the carry-in of the next more significant full adder. Because each carry bit ripples into the next stage, it is known as a ripple carry adder. In an RCA, the sum and carry-out bits of any full adder stage are not valid until the carry-in of that stage has arrived. This is due to the propagation delay of the internal circuit. The propagation delay is the time that elapses between the application of an input and the appearance of the corresponding output. For instance, in a NOT gate the output is "0" when the input is "1", and conversely the output is "1" when the input is "0". The time taken for the output to become "0" after the input becomes "1" is the propagation delay of the NOT gate. Likewise, the time between the arrival of the Cin signal and the instant at which the Cout signal appears is known as the carry propagation delay. The logical representation of a 4-bit ripple carry adder is shown in Fig 1 below.

SJIF Rating: 8.176

ISSN: 2582-3930



Fig 1. Logical Representation of 4-bit RCA

Thus, the sum and carry outputs of Full Adder 1 are valid only after its propagation delay. Similarly, the sum output of Full Adder 4 is valid only after the combined propagation delay of Full Adder 1 to Full Adder 4. The result of the ripple carry adder is therefore valid only after the combined propagation delays of all the full adder circuits have elapsed.
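The rippling of the carry can be illustrated with a small behavioural model. The following is a minimal Python sketch, not the transistor-level design simulated in this paper; the function names `full_adder`, `ripple_carry_add`, `to_bits` and `from_bits` are hypothetical and chosen only for illustration.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two little-endian bit lists; the carry ripples stage by stage."""
    assert len(a_bits) == len(b_bits)
    sum_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        # Each stage must wait for the carry produced by the previous stage.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, width):
    """Little-endian bit list of an unsigned integer."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Unsigned integer from a little-endian bit list."""
    return sum(bit << i for i, bit in enumerate(bits))
```

For example, `ripple_carry_add(to_bits(9, 4), to_bits(7, 4))` yields the four sum bits of 0 and a carry-out of 1, since 9 + 7 = 16. The sequential loop mirrors the dependency of each stage on the carry of the stage before it.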



Fig 3. Design of 16-bit Ripple Carry Adder






Fig 4. Design of 32-bit Ripple Carry Adder

### 3.2 CLA

#### 3.2.1 CLA Logic Diagram

A CLA is a digital circuit used to speed up the determination of carry bits in binary addition. Computers rely on arithmetic operations such as addition, subtraction, multiplication and division, where division can be performed by repeated subtraction and multiplication by repeated addition. The CLA is an improvement over the RCA, whose speed is limited by the time taken to produce the carry signals. The RCA is made up of a series of 1-bit adders, whereas the CLA computes the carry signals in advance from the input bits alone and does not wait for the carry to propagate through the adder stages. An adder with reduced delay is thus obtained. The logical representation of a 4-bit CLA is shown in Fig 5 below.



In Fig 5, Ai and Bi denote the two input bits and Ci denotes the carry input from the preceding stage. The outputs are Si and Ci+1, where Ci+1 is passed to the next stage of the adder. Regardless of the value of Ci, the carry output Ci+1 is 1 when both input bits Ai and Bi are 1; when exactly one of the two inputs is 1, Ci+1 equals Ci.
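The carry-lookahead principle can be sketched with the standard generate and propagate terms, G = A AND B and P = A XOR B, which are textbook definitions rather than notation taken from Fig 5. The Python function below is an illustrative behavioural model with a hypothetical name, `cla_add_4bit`; every carry is written directly in terms of the inputs and the initial carry-in.

```python
def cla_add_4bit(a, b, cin=0):
    """4-bit carry-lookahead addition of integers a and b (0..15).

    Returns (sum, carry_out). All carries are expressed directly in terms
    of the generate/propagate terms and cin, without rippling.
    """
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(a_bits, b_bits)]  # generate: both inputs are 1
    p = [x ^ y for x, y in zip(a_bits, b_bits)]  # propagate: exactly one input is 1

    c0 = cin
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    total = sum((p[i] ^ carries[i]) << i for i in range(4))
    return total, c4
```

The expanded carry expressions grow with the bit position, which reflects the additional logic gates that a CLA spends to avoid waiting for the ripple.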



Fig 6. Design of 8-bit Carry-look ahead Adder






Fig 7. Design of 16-bit Carry-look ahead Adder



Fig 8. Design of 32-bit Carry-look ahead Adder

### 3.3 RD4-A

#### 3.3.1 RD4-A Logic Diagram

The RD4-A reduces propagation delay and power consumption by handling two bits concurrently. The conventional architecture of a 2-bit RD4A is shown in Fig. 9. It has five inputs, $P_i$, $P_{i+1}$, $Q_i$, $Q_{i+1}$ and $C_{in}$, and three outputs, Sum<sub>i</sub>, Sum<sub>i+1</sub> and $C_{out}$. Here $P_i$ and $P_{i+1}$ denote the first and second bits of the addend P, and $Q_i$ and $Q_{i+1}$ denote the first and second bits of the addend Q. Sum<sub>i</sub> and Sum<sub>i+1</sub> denote the sum bits of the RD4A and C<sub>out</sub> denotes the carry output. To use the logic gates shown in Fig. 9, each XOR gate is split into two NOR gates and one AND gate.
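As an illustration, the sketch below models a 2-bit radix-4 block behaviourally in Python and builds each XOR from two NOR gates and one AND gate, using the identity A XOR B = NOR(A AND B, NOR(A, B)). This is only one possible decomposition and is not claimed to be the exact gate arrangement of Fig. 9; the names `nor`, `xor_from_nor_and` and `radix4_block` are hypothetical.

```python
def nor(a, b):
    """Two-input NOR on single bits."""
    return 1 - (a | b)


def xor_from_nor_and(a, b):
    """XOR built from two NOR gates and one AND gate."""
    return nor(a & b, nor(a, b))


def radix4_block(p0, p1, q0, q1, cin):
    """Behavioural 2-bit adder block: returns (sum0, sum1, cout).

    p0/q0 are the lower bits (index i) and p1/q1 the upper bits (index i+1).
    """
    # Carry from the lower bit position into the upper one.
    c1 = (p0 & q0) | (cin & (p0 | q0))
    sum0 = xor_from_nor_and(xor_from_nor_and(p0, q0), cin)
    sum1 = xor_from_nor_and(xor_from_nor_and(p1, q1), c1)
    cout = (p1 & q1) | (c1 & (p1 | q1))
    return sum0, sum1, cout
```

Both sum bits and the carry-out are produced by one block, so an n-bit adder needs only n/2 such blocks in its carry path.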



Fig 9. Logic Diagram of 2-bit Radix-4 adder



Fig 10. Radix-4 Adder 8-bit






Fig 11. Radix-4 Adder 16-bit



Fig 12. Radix-4 Adder 32-bit

### 3.4 ACRA

#### 3.4.1 ACRA Logic Diagram

The ACRA has two modes: the accurate mode, in which it functions like a normal adder, and the approximate mode, in which it functions like an approximate adder.

The key technique used in the proposed ACRA is power gating, which is used to reconfigure the accuracy of the computation. Power gating controls transistor switching and isolates outputs without any extra logic gates, which keeps the circuit small. In the proposed ACRA logic diagram in Fig. 13, K5 to K7 are power gated under various conditions and the others are ordinary logic gates. SAPP denotes the input signal that determines the mode in which the ACRA operates. When SAPP is 0 the ACRA operates in accurate mode, and when SAPP is 1 it operates in approximate mode. The uppermost part of Fig. 13 is the power supply. The power supplies VDDVR and VDDVB are derived from the master power supply VDD. K5 and K6 are powered by VDDVR, K7 is powered by VDDVB, and the other logic gates are powered by VDD. When SAPP is 0 (accurate mode), transistors R1 and R2 are turned on to connect VDD to VDDVR and VDDVB. When SAPP is 1 (approximate mode), transistors R1 and R2 are switched off, so VDDVR is cut off from VDD. The state of transistor R3 then depends on the carry-in signal. When Cin is 0, R3 is switched on and VDDVB is at the same voltage level as VDD. When Cin is 1, however, VDDVB is cut off from VDD. Thus, the NOR gate labelled K7 is switched off when the ACRA is in approximate mode and the carry-in is 1.

In Fig. 13, the carry chain consists of the two logic gates K1 and K2, and the ACRA uses the SAPP signal to control the output of K1. In accurate mode (SAPP = 0), the signal at KD1 is $(P_{i+1} + Q_{i+1})(P_i + Q_i)$ and the signal at KD2 is $C_{in}(P_{i+1} + Q_{i+1})(P_i + Q_i)$, from which the carry-out is obtained. In approximate mode (SAPP = 1), the signals KD1 and KD2 are both 0, and Cout is obtained accordingly. Moreover, during approximate computation the internal switching activity of K1 and K2 stops, which reduces the power dissipation.

In Fig. 13, K3 and K4 are the logic gates that follow K5 to K7. When K5 to K7 are switched off, the corresponding inputs of K3 and K4 must be tied to GND so that K3 and K4 operate normally. In accurate mode (SAPP = 0), the value of KD3 is $C_{in}P_i$, the value of KD4 is $C_{in}Q_i$, the output of K3 is $P_iQ_i + C_{in}P_i + C_{in}Q_i$, and thus Sum<sub>i+1</sub> is obtained. When SAPP is 1, K5 and K6 are switched off and KD3 and KD4 are pulled down to 0 because the two NMOS transistors switch on; the output of K3 then reduces to $P_iQ_i$, from which Sum<sub>i+1</sub> is computed.

In Fig. 13, the logic gate K4 produces the signal Sum<sub>i</sub>. When SAPP is 0, KD5 equals the output of K7 and Sum<sub>i</sub> is obtained. When SAPP is 1, the output of K4 depends on the value of Cin. When Cin is 0, KD5 again equals the output of K7 and Sum<sub>i</sub> = $P_i \oplus Q_i$. When both SAPP and Cin are 1, K7 is switched off and KD5 is pulled down to 0 through the NMOS transistor, so that Sum<sub>i</sub> is equal to 1, as shown in Fig. 13. Hence, SAPP and Cin together force the value of Sum<sub>i</sub>. This kind of operation is known as a dynamic output modification scheme.

The basic architecture of the radix-4 adder is modified in the ACRA so that its accuracy can be changed. To minimize power dissipation in approximate mode, the ACRA switches off the power supply of three logic gates in order to stop the switching activity of K1, K2 and K4. It also breaks the carry chain, so that the Cout signal is independent of the Cin signal. When multi-bit adders are connected in series, the ACRA reduces the delay while maintaining an error distance of 2k and an error rate of 25%.
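The two modes can be explored with a small behavioural model. The Python sketch below is an illustration under stated assumptions, not the gate-level circuit of Fig. 13. It follows the text for Sum<sub>i</sub> (equal to $P_i \oplus Q_i$ when Cin is 0 and forced to 1 when Cin is 1) and for the carry into the upper bit (reduced to $P_iQ_i$). The approximate-mode Cout expression is an assumption: the text states only that Cout is independent of Cin, so the model takes the carry-out of the 2-bit addition with Cin ignored. The names `acra_block`, `block_value` and `error_stats` are hypothetical.

```python
from itertools import product


def acra_block(p0, p1, q0, q1, cin, s_app):
    """Behavioural 2-bit ACRA block: returns (sum0, sum1, cout).

    s_app = 0 selects the accurate mode, s_app = 1 the approximate mode.
    """
    if s_app == 0:
        c1 = (p0 & q0) | (cin & (p0 | q0))   # exact carry into the upper bit
        sum0 = p0 ^ q0 ^ cin
    else:
        c1 = p0 & q0                         # Cin terms are gated off
        sum0 = (p0 ^ q0) if cin == 0 else 1  # dynamic output modification
    sum1 = p1 ^ q1 ^ c1
    # Assumption: in approximate mode Cout is the carry of P + Q with Cin ignored.
    cout = (p1 & q1) | (c1 & (p1 | q1))
    return sum0, sum1, cout


def block_value(sum0, sum1, cout):
    """Numeric value represented by the three output bits of one block."""
    return sum0 + 2 * sum1 + 4 * cout


def error_stats():
    """Exhaustively compare both modes: returns (error_rate, max_error_distance)."""
    cases = list(product((0, 1), repeat=5))
    mismatches = 0
    max_distance = 0
    for p0, p1, q0, q1, cin in cases:
        exact = block_value(*acra_block(p0, p1, q0, q1, cin, 0))
        approx = block_value(*acra_block(p0, p1, q0, q1, cin, 1))
        distance = abs(exact - approx)
        if distance:
            mismatches += 1
        max_distance = max(max_distance, distance)
    return mismatches / len(cases), max_distance
```

Under these assumptions, `error_stats()` reports a mismatch in 8 of the 32 input combinations of a single block, with an error distance of 1 within the block. This is consistent with the 25% error rate quoted above, but it is not a substitute for the transistor-level simulation.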





Fig 14. Accuracy-configurable RD4 Adder 8-bit



Fig 15. Accuracy-configurable RD4 Adder 16-bit






Fig 16. Accuracy-configurable RD4 Adder 32-bit

#### 4. Results And Discussions

The schematics are created using Tanner EDA Software. Table 1 lists propagation delay time, power dissipation and PDP of RCA, CLA, RD4-A, and ACRA for 8, 16 and 32-bit adders.

| Adders | 8-Bit            |         |       | 16-Bit           |         |       | 32-Bit           |         |       |
|--------|------------------|---------|-------|------------------|---------|-------|------------------|---------|-------|
|        | Power<br>(watts) | Delay   | PDP   | Power<br>(watts) | Delay   | PDP   | Power<br>(watts) | Delay   | PDP   |
| RCA    | 7.99e-3          | 10.39n  | 87.33 | 6.32e-3          | 10.33n  | 65.28 | 1.90e-2          | 10.28n  | 195   |
| CLA    | 2.50e-3          | 10.80n  | 27.01 | 4.02e-3          | 211.31p | 84.94 | 1.48e-2          | 389.06p | 57.58 |
| RD4-A  | 1.24e-3          | 10.23n  | 12.68 | 5.13e-3          | 29.82n  | 15.2  | 1.02e-2          | 30.15n  | 30.75 |
| ACRA   | 5.79e-3          | 262.28p | 15.18 | 1.81e-2          | 80.10n  | 14.4  | 6.46e-2          | 29.28n  | 18.91 |

**Table 1** Comparison of Power, Delay, PDP of various Adders

Table 1 indicates that the RCA consumes less power than the other adders, mainly because of its simple structure. However, the conventional RCA is not suitable for present-day systems because of its high propagation delay. To reduce the propagation delay, the CLA uses numerous additional logic gates, which results in higher power consumption than the other adders. Compared with the RD4-A, the ACRA dissipates more power and has a longer delay; compared with the RCA and CLA, it consumes slightly more power but has a lower delay.
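The power-delay product is the product of the power dissipation and the propagation delay. The short Python sketch below works through the 8-bit RD4-A entry of Table 1, assuming the power is in watts, the delay suffix "n" denotes nanoseconds, and the tabulated PDP for this row is expressed in picojoules. The function name `power_delay_product` is hypothetical.

```python
def power_delay_product(power_w, delay_s):
    """PDP in joules from power in watts and delay in seconds."""
    return power_w * delay_s


# 8-bit RD4-A row of Table 1: 1.24e-3 W and 10.23 ns.
pdp_joules = power_delay_product(1.24e-3, 10.23e-9)
pdp_picojoules = pdp_joules * 1e12  # about 12.69 pJ, against 12.68 in the table
```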

#### 3.4.2 Medical Image Enhancement

Furthermore, to examine the effect of the accuracy configuration on image quality, the input images are compared with the output images produced using the ACRA.

The contrast of an image refers to the spread between its dark and light pixels. In this paper, contrast stretching is performed using

$$np = 3(op - 5) + 2 \tag{1}$$

where op is the old (input) pixel value and np is the new pixel value after conversion.
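Equation (1) can be applied pixel by pixel, as in the following minimal Python sketch. It uses exact integer arithmetic rather than the ACRA hardware, and it clamps the result to the 8-bit range 0 to 255, which is an assumption not stated in the text. The names `stretch_pixel` and `stretch_image` are hypothetical.

```python
def stretch_pixel(op):
    """Apply np = 3(op - 5) + 2 to one pixel value."""
    np_value = 3 * (op - 5) + 2
    # Assumption: results outside the 8-bit range are clamped to 0..255.
    return max(0, min(255, np_value))


def stretch_image(image):
    """Apply the stretch to a grey-scale image given as a list of rows."""
    return [[stretch_pixel(px) for px in row] for row in image]
```

For instance, an input pixel of 50 maps to 3(50 - 5) + 2 = 137, while inputs of 0 and 100 fall outside the 8-bit range and are clamped to 0 and 255 respectively.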

In this study, to show the effect of the accuracy configuration, image enhancement was performed on biomedical images using the 8-bit ACRA. The input images are shown in Fig. 17, Fig. 19 and Fig. 21, and the corresponding enhanced images are shown in Fig. 18, Fig. 20 and Fig. 22.



Fig 17. Input Images



Fig 18. Enhanced Images



Fig 19. Input Images



Fig 20. Enhanced Images





Fig 21. Input Images



Fig 22. Enhanced Images

**Table 2** MSE and PSNR Values

| Figure | MSE        | PSNR    |
|--------|------------|---------|
| 17/18  | 62874.2592 | 47.9847 |
| 19/20  | 57667.0913 | 47.6093 |
| 21/22  | 29518.8119 | 44.7010 |

Image quality can be evaluated using various quantitative methods as well as visual inspection by radiologists, and the quality of image enhancement is hard to assess. In this study, the peak signal-to-noise ratio (PSNR) and mean square error (MSE) are used as quantitative measures of quality.

$$MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} (I(i, j) - K(i, j))^2 \tag{2}$$

$$PSNR = 20 \times \log_{10}(\frac{MAX_I}{\sqrt{MSE}}) \tag{3}$$
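Equations (2) and (3) translate directly into code. The following Python sketch is a minimal illustration in which I and K are taken to be two grey-scale images of size m × n given as lists of rows, and MAX_I defaults to 255 on the assumption of 8-bit pixels. The function names `mse` and `psnr` are hypothetical.

```python
import math


def mse(image_i, image_k):
    """Mean square error between two equally sized images, as in Eq. (2)."""
    m = len(image_i)
    n = len(image_i[0])
    total = sum(
        (image_i[i][j] - image_k[i][j]) ** 2
        for i in range(m)
        for j in range(n)
    )
    return total / (m * n)


def psnr(image_i, image_k, max_i=255):
    """Peak signal-to-noise ratio as in Eq. (3); max_i = 255 assumes 8-bit pixels."""
    error = mse(image_i, image_k)
    if error == 0:
        return math.inf  # identical images
    return 20 * math.log10(max_i / math.sqrt(error))
```

A lower MSE gives a higher PSNR, so the two measures move in opposite directions for a fixed MAX_I.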

The results in Table 2, obtained after applying the contrast enhancement technique, show that image enhancement using the 8-bit ACRA yields a lower MSE and a higher PSNR, which indicates a high-quality image.

### **5.** Conclusions

This paper has presented an ACRA whose accuracy can be configured. To minimize the circuit overhead of the presented ACRA, power gating is used together with settings related to the control signal. Based on the analysis of the results, the ACRA excels in accuracy, power consumption and propagation delay compared with the other adders. The ACRA outperforms the RD4-A in accuracy and propagation delay, even though it dissipates more power than the RD4-A. Compared with the RCA and CLA, the accurate mode of the ACRA has a low PDP. When the accurate mode is applied in image processing, the proposed ACRA achieves better image quality.

Hence, the designed ACRA circuit performs well in both the accurate and approximate modes.

#### REFERENCES

[1]. "Design Of Ripple Carry Adder Using Cmos Output Wired Logic Based Majority Gate", By Mili Sarkar; G. S. Taki; Prerna; Rimi Sengupta; Soham Nandi Ray - in 2017 8th Annual Industrial Automation and Electromechanical Engineering Conference (IEMECON).

[2]. "Low-Voltage Low-Power Cmos Full Adder", By D. Radhakrishnan - in IEE Proceedings - Circuits, Devices and Systems, Volume 148, Issue 1, February 2001, p. 19 – 24

[3]. "New Improved 1-Bit Full Adder Cells", By Sreehari Veeramachaneni; M.B. Srinivas - in 2008 Canadian Conference on Electrical and Computer Engineering.

[4]. "Reversible Adder Design For Ripple Carry And Carry Look Ahead (4, 8, 16, 32-bit)", By Neelam Somani; Chitrita Chaudhary; Sharad Yadav - in 2016 International Conference on Computing, Communication and Automation (ICCCA)

[5]. "Layout Designing and Transient Analysis of Carry Lookahead Adder Using 300nm Technology-A Review", By Lily Kanoriya; Aparna Gupta; Dr. Soni Changlani - in International Journal of Engineering Development and Research | Volume 4, Issue 2 | ISSN: 2321-9939

[6]. "Design and Analysis of RCA and CLA using CMOS, GDI, TG and ECRL Technology", By Kuldeep Singh Shekhawat; Gajendra Sujediya - in International Journal of Advanced Engineering Research and Science (IJAERS), Vol-4, Issue-11. Nov-2017. https://dx.doi.org/10.22161/ijaers.4.11.19 ISSN: 2349-6495(P) | 2456-1908(O)

[7]. "Performance Improvement of 4-Bit Static CMOS Carry Look-Ahead Adder Using Modified Circuits for Carry Propagate and Generate Terms", By Mehedi Hasan, Moumita Sadia Islam, Muhtasim Rafid Ahmed - in Science Journal of Circuits, Systems and Signal Processing, Volume 8, Issue 2, December 2019, Pages: 76-81

[8]. "Carry based approximate full adder for low power approximate computing", By M. Ramasamy, G. Narmadha, and S. Deivasigamani - in Proc.7th Int. Conf. Smart Comput. Commun. (ICSCC), Jun. 2019, pp. 1-4.

[9]. "FPGA-based multi-level approximate multipliers for high-performance error-resilient applications", By N. Van Toan and J.-G. Lee - in IEEE Access, vol. 8, pp. 25481–25497, 2020.

[10]. "Sensor-based approximate adder design for accelerating error-tolerant and deep-learning applications", By N.-C. Huang, S.-Y. Chen, and K.-C. Wu - in Proc. Design, Autom. Test Eur. Conf. Exhib. (DATE), Mar. 2019, pp. 692-697.

[11]. "Low power Karnaugh map approximate adder for error compensation in loop accumulations", By C. Yang and H. Jiao - in Proc. Int. Conf. IC Design Technol. (ICICDT), Jun. 2019, pp. 1–4.

[12]. "Low-power digital signal processing using approximate adders", By V. Gupta, D. Mohapatra, A. Raghunathan, and K. Roy - in IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 32, no. 1, pp. 124–137, Jan. 2013.

[13]. "On the use of approximate adders in carry-save multiplier accumulators", By D. Esposito, D. De Caro, E. Napoli, N. Petra, and A. G. M. Strollo - in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2017, pp. 1–4.

[14]. "Design and analysis of an approximate adder with hybrid error reduction", By H. Seo, Y. S. Yang, and Y. Kim - in Electronics, vol. 9, no. 3, pp. 1–13, Mar. 2020.

[15]. "A simple yet efficient accuracy-configurable adder design", By W. Xu, S. S. Sapatnekar, and J. Hu - in IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 26, no. 6, pp. 1112–1125, Jun. 2018.