## Research Interests

**Computer arithmetic, Computer architecture, Cryptographic arithmetic, Reconfigurable computing on FPGA, Digital FPGA/ASIC design, Digital circuit optimization, Embedded system design**

## Current Research Projects (Ph.D., University of Saskatchewan, Canada)

1. Ph.D. Thesis: Algorithms, Architecture and VLSI Circuit Designs for Decimal Floating-Point (DFP) Transcendental Function Computation

During my Ph.D., we have advanced several research projects in the field of DFP arithmetic. A series of decimal fixed-point (FXP) and DFP arithmetic IP cores has been designed and implemented to support the new decimal formats (decimal32, decimal64 and decimal128) defined in IEEE 754-2008. The goal of my Ph.D. research is to develop efficient algorithms, architectures, and VLSI circuit designs for DFP transcendental function computation, in order to better understand the potential costs and benefits of current microprocessor support for DFP transcendental arithmetic. The term "decimal transcendental functions" refers to the following DFP mathematical operations: logarithms (log10(DFP) or ln(FXP)), antilogarithms (10^DFP), exponentials (e^FXP), reciprocal (DFP^−1), square root (DFP^1/2), division (DFP1/DFP2), etc., where DFP represents any one of the three DFP formats and FXP represents the decimal significand of a DFP format, as specified in the IEEE 754-2008 standard.
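As a software point of reference for the operations listed above, the following is a minimal sketch using Python's standard `decimal` module. It is not the hardware design described here. The context parameters (16 digits, Emax = 384, Emin = -383) are chosen to resemble decimal64, and the names `DECIMAL64` and `transcendental_samples` are hypothetical.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Assumption: a software stand-in for decimal64 (16 significant digits,
# exponent limits Emax = 384 and Emin = -383). Not a bit-exact model.
DECIMAL64 = Context(prec=16, rounding=ROUND_HALF_EVEN, Emax=384, Emin=-383)


def transcendental_samples(x: Decimal, y: Decimal) -> dict:
    """Evaluate the listed operations on x (and x / y) at 16-digit precision."""
    ctx = DECIMAL64
    return {
        "log10": ctx.log10(x),                    # log10(DFP)
        "ln": ctx.ln(x),                          # ln(FXP)
        "antilog10": ctx.power(Decimal(10), x),   # 10^DFP
        "exp": ctx.exp(x),                        # e^FXP
        "reciprocal": ctx.divide(Decimal(1), x),  # DFP^-1
        "sqrt": ctx.sqrt(x),                      # DFP^1/2
        "division": ctx.divide(x, y),             # DFP1 / DFP2
    }


if __name__ == "__main__":
    for name, value in transcendental_samples(Decimal("4"), Decimal("8")).items():
        print(f"{name:>10}: {value}")
```

A software library like this is useful mainly as a golden reference when checking the results of a hardware unit.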

**Table-Based Decimal/Binary Polynomial Approximation Method**

- Efficient Decimal Logarithmic and Antilogarithmic Converters based on First-Order Polynomial Approximation (*Started Sept. 2007*) Refer to Publications __[J6]__, __[C4]__, __[C7]__
- A Dynamic Non-Uniform Segmentation Method for First-Order Polynomial Function Evaluation (*Started Nov. 2008*) Refer to Publication __[J5]__
- A Decimal Transcendental Function Evaluation Platform (*Started July 2009*) On-going
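The general idea of table-based first-order polynomial approximation can be sketched as follows. This is an illustration only, not the method of the cited publications: it assumes uniform segments over [1, 10) and binary floating-point coefficients, whereas the projects above concern decimal arithmetic and non-uniform segmentation. The names `build_table` and `log10_approx` are hypothetical.

```python
import math

LOW, HIGH = 1.0, 10.0  # assumed input range: a normalized decimal significand


def build_table(segments: int = 64):
    """Return one (c0, c1) pair per segment so that
    log10(x) ~ c0 + c1 * (x - segment_start) inside that segment."""
    width = (HIGH - LOW) / segments
    table = []
    for i in range(segments):
        x0 = LOW + i * width
        y0 = math.log10(x0)
        y1 = math.log10(x0 + width)
        table.append((y0, (y1 - y0) / width))  # line through both endpoints
    return table


def log10_approx(x: float, table) -> float:
    """Evaluate the first-order approximation: one lookup, one multiply-add."""
    if not LOW <= x < HIGH:
        raise ValueError("x must lie in [1, 10)")
    segments = len(table)
    width = (HIGH - LOW) / segments
    index = min(int((x - LOW) / width), segments - 1)
    c0, c1 = table[index]
    return c0 + c1 * (x - (LOW + index * width))


if __name__ == "__main__":
    table = build_table()
    worst = max(abs(log10_approx(1 + k / 1000, table) - math.log10(1 + k / 1000))
                for k in range(9000))
    print(f"worst-case error over the sample: {worst:.2e}")
```

With uniform segments the error is largest near x = 1, where the curvature of the logarithm is highest. That imbalance is the usual motivation for non-uniform segmentation.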

**Decimal Digit-Recurrence with Selection by Rounding Method**

- A DFP Converter by Digit-Recurrence with Selection by Rounding (*Started May 2008*) Refer to Publications __[J3]__, __[C2]__
- A DFP Antilogarithmic Converter by Digit-Recurrence with Selection by Rounding (*Started Dec. 2008*) Refer to Publications __[J4]__, __[C3]__
- A DFP Combined Division, Square Root and Reciprocal Unit based on Digit-Recurrence with Selection by Rounding (*Started Dec. 2009*) On-going

**Decimal Functional Iteration Method**

- A Decimal Reciprocal Unit Based on Newton-Raphson Iterations (*Started Oct. 2006*) Refer to Publication __[C9]__
- Improving Speed for DFP Combined Division, Square Root and Reciprocal Unit based on Newton-Raphson Iterations (*Started Dec. 2009*) On-going

**Performance Analysis of DFP Transcendental Function Computation**

- Functional Verification Platform for DFP Transcendental Arithmetic IP based on the Intel software DFP Math library (*Started Dec. 2009*) On-going

2. Collaborative Research Projects

- Design and Tradeoff Analysis of a Binary Floating-Point (BFP) Adder on FPGAs (*Started Jan. 2007*) Refer to Publication __[J1]__
- Combining ESOP Minimization with BDD-based Decomposition for Improved FPGA Synthesis (*Started Mar. 2007*) Refer to Publication __[J2]__
- A Radix-100 Decimal Divider based on Non-Restoring Algorithm (*Started June 2008*) Refer to Publication __[C11]__
- An Efficient Logarithmic Arithmetic Unit for FXP 3D Graphics System Application (*Started Oct. 2008*) Publication in Preparation
- A Multi-Core Elliptic Curve Cryptosystem (ECC) Processor over GF(2^163) (*Started Mar. 2009*) Refer to Publications __[C1]__, __[J7]__
- A Combined Rounding/Normalization Unit for Binary Integer Decimals (BID) (*Started Oct. 2009*) Publication in Preparation

3. Future Research Projects (Refer to Research Statement):

- A Combined DFP Division/Square Root Unit (also supporting as many transcendental function computations as possible)
- Performance Evaluation for DFP Arithmetic based on BID and DPD Encoding
- DFP arithmetic with SIMD support
- High-Performance DFP Arithmetic on FPGA

## Previous Research Projects (M.Sc., Lund University, Sweden)

1. Digital Emulation of Analogue CNN Systems on FPGA (Started May 2005, M.Sc. Thesis) Refer to Publication [C8], [Fang]

Microelectronics promises high component density and low power dissipation for embedded systems. Unfortunately, such components always suffer from various types of error that make the chip respond differently from its functional simulation. This is especially true for Cellular Neural Networks (CNN), which makes determining robust, low-precision parameters that guarantee a small footprint and reliable operation an important design consideration. This thesis describes the digital word-width effects in a CNN implementation that must be considered to achieve a small, reliable system. It also discusses automated design space exploration using a Field-Programmable Gate Array (FPGA) implementation to select optimal CNN parameters.

2. Design and Implementation of a Reciprocal Unit based on Newton-Raphson Iteration (Started Nov 2004, IC Project and Verification) Refer to Publication [C10]

This research presents the design and implementation of a reciprocal unit in which the initial approximation of the reciprocal is obtained using a look-up table and a multiplication. How to create the look-up table efficiently is described in detail, and an error analysis for ROMs of different sizes is also given in the paper. The presented design uses a 2^7 × 16-bit ROM followed by two Newton-Raphson iterations. It takes 10 clock cycles to approximate the reciprocal of a double-precision floating-point number to 52-bit accuracy. The proposed reciprocal unit is implemented on a Xilinx FPGA and in the AMS 0.35-um CMOS process. The functionality of the chip was successfully verified.
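The table-plus-iteration structure can be sketched in software as below. This is a simplified illustration, not the published design. It assumes a plain 2^7-entry table of midpoint reciprocals held as Python floats and omits the extra multiplication used for the initial approximation, so two iterations reach a relative error of roughly 2^-32 rather than 52 bits. The names `RECIP_TABLE` and `reciprocal` are hypothetical.

```python
TABLE_BITS = 7  # 2**7 entries, matching the ROM depth quoted above

# Assumption: each entry is the reciprocal of the midpoint of one segment of
# [1, 2), stored at full float precision (the real ROM holds 16-bit words).
RECIP_TABLE = [
    1.0 / (1.0 + (i + 0.5) / 2**TABLE_BITS) for i in range(2**TABLE_BITS)
]


def reciprocal(d: float, iterations: int = 2) -> float:
    """Approximate 1/d for a significand d in [1, 2)."""
    if not 1.0 <= d < 2.0:
        raise ValueError("d must lie in [1, 2)")
    index = int((d - 1.0) * 2**TABLE_BITS)  # leading fraction bits pick the entry
    x = RECIP_TABLE[index]                  # initial approximation of 1/d
    for _ in range(iterations):
        x = x * (2.0 - d * x)               # Newton-Raphson step: error is squared
    return x


if __name__ == "__main__":
    d = 1.2345
    for n in range(3):
        err = abs(d * reciprocal(d, iterations=n) - 1.0)
        print(f"{n} iteration(s): relative error {err:.3e}")
```

Because each iteration squares the relative error, the accuracy of the initial approximation determines how many iterations are needed, which is why the table construction matters.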

3. Developed an Accelerator for the Greatest Common Divisor (GCD) on FPGA (Started Jan 2005, Embedded System Course Project)

Proposed a hardware/software solution for accelerating the computation of the Greatest Common Divisor (GCD) of N numbers. Development and testing were carried out using the Xilinx Embedded Development Kit and the MB1000 Virtex II evaluation board.
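For reference, the GCD of N numbers can be reduced to repeated two-operand GCDs. The sketch below is a software-only illustration using Euclid's algorithm. It says nothing about how the project split the work between hardware and software, and the names `gcd_pair` and `gcd_many` are hypothetical.

```python
from functools import reduce


def gcd_pair(a: int, b: int) -> int:
    """Euclid's algorithm for two integers."""
    while b:
        a, b = b, a % b
    return abs(a)


def gcd_many(values) -> int:
    """Fold gcd_pair over the inputs; the GCD of an empty list is taken as 0."""
    return reduce(gcd_pair, values, 0)


if __name__ == "__main__":
    print(gcd_many([12, 18, 30]))
```

In an accelerator of this kind, the two-operand step is the natural candidate for hardware, while the loop over the N inputs can stay in software.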

4. Design and Implementation of a Time-Multiplexed FIR Filter in the AMS 0.35-um CMOS Process (Started Jan 2005, DSP Course Project)

Wrote the RTL code and performed synthesis, post-synthesis simulation, and place and route.

5. Developed a VGA Controller on FPGA (Started Sept 2004, Introduction of VLSI Architecture Course Project)

Wrote the HDL code, simulated it with ModelSim, and synthesized it with Synplify Pro.

## Industry Experience (ASIC Verification Engineer, Auvitek International Ltd, China)

- Verified the functions of the company's IC products, the DTV and ATV demodulators (AU8522 and AU8521); verified IPs (I2C, I2S, IR, PCI, etc.) for the AU8521 and AU8522 on a multi-FPGA verification platform.
- Co-developed Mariana, the company’s new-generation multi-FPGA verification platform.
- Developed verification plans, wrote bus-functional models in Verilog HDL, integrated IC products into the multi-FPGA verification platform, executed test cases, and tracked down bugs and technical problems.