
# SASIMI 2019: The 22nd Workshop on Synthesis And System Integration of Mixed Information Technologies, Technical Program

Remark: The presenter of each paper is marked with "*".

## Session Schedule

 Monday, October 21, 2019

| Session | Time |
|---|---|
| Registration | 8:30 - |
| Opening | 9:00 - 9:20 |
| K1  Keynote Speech I | 9:20 - 10:20 |
| R1  Regular Poster Session I | 10:20 - 11:50 |
| Lunch | 11:50 - 13:20 |
| I1  Invited Talk I | 13:20 - 14:10 |
| R2  Regular Poster Session II | 14:10 - 15:40 |
| D  Panel Discussion | 15:40 - 17:10 |
| Banquet | 18:00 - 20:00 |

 Tuesday, October 22, 2019

| Session | Time |
|---|---|
| K2  Keynote Speech II | 9:20 - 10:20 |
| R3  Regular Poster Session III | 10:20 - 11:50 |
| Lunch | 11:50 - 13:20 |
| I2  Invited Talk II | 13:20 - 14:10 |
| R4  Regular Poster Session IV | 14:10 - 15:40 |
| I3  Invited Talk III | 15:40 - 16:30 |
| Closing | 16:30 - 16:40 |

## List of papers

 Monday, October 21, 2019

Keynote Speech I
Time: 9:20 - 10:20 Monday, October 21, 2019
Chair: Tsung-Yi Ho (National Tsing Hua University, Taiwan)

K1-1 (Time: 9:20 - 10:20)
 Title (Keynote Speech) Microfluidics Meets Microbiology: The Journey of Digital Microfluidic Biochips from Laboratory Research to Commercialization and Beyond Author *Krishnendu Chakrabarty (Duke University, USA) Page p. 1 Keyword Microfluidics Abstract Digital microfluidics was transitioned to the marketplace for sample preparation by Illumina a few years ago. Since then, this technology has also been deployed by Genmark for infectious disease testing and Baebies for the detection of lysosomal enzymes in newborns. This lecture will describe the journey from early laboratory research, PhD theses and publication of research articles, to technology transfer and licensing to companies. Despite these success stories, there still remains a significant gap between microfluidics research and its adoption in microbiology. The presenter will describe how this gap can potentially be closed through new directions in digital microfluidics, including recent advances in micro-electrode-dot arrays, acoustofluidics, and countermeasures against malicious attacks on biomolecular protocols.

Regular Poster Session I
Time: 10:20 - 11:50 Monday, October 21, 2019
Chairs: Masashi Imai (Hirosaki University, Japan), Rung-Bin Lin (Yuan Ze University, Taiwan)

Best Paper Award
R1-1 (Time: 10:20 - 10:22)
 Title Energy-efficient ECG Signals Outlier Detection Hardware using a Sparse Robust Deep Autoencoder Author *Naoto Soga, Shimpei Sato, Hiroki Nakahara (Tokyo Institute of Technology, Japan) Page pp. 2 - 7 Keyword Outlier Detection, autoencoder, a sparse network, FPGA, ECG Abstract In recent years, portable electrocardiographs have begun to spread, enabling us to record electrocardiogram (ECG) signals in everyday life. A portable ECG analysis device is needed so that abnormal ECG waves can be detected anywhere. Machine learning techniques, including deep learning, are widely used in research on ECG signal analysis since they outperform conventional methods. However, deep learning models often have too many parameters to implement on mobile hardware. In this research, we propose a method to implement an ECG outlier detector using deep learning techniques in a small built-in device. To detect outliers, an autoencoder, which is based on neural networks, was used. A sparseness technique was applied to the autoencoder, and the trained autoencoder was implemented on a low-end FPGA. Compared with an ARM Cortex-M3 embedded processor, the proposed hardware achieves 159 times better energy efficiency.

R1-2 (Time: 10:22 - 10:24)
 Title A Design Space Exploration Method of SoC Architecture for CNN-based AI Platform Author *Salita Sombatsiri (Osaka University, NEC Corporation, Japan), Jaehoon Yu, Masanori Hashimoto (Osaka University, Japan), Yoshinori Takeuchi (Kindai University, Japan) Page pp. 8 - 13 Keyword Design space exploration, System-on-a-chip, CNN, multi-layer bus Abstract This paper proposes a design space exploration (DSE) method for a CNN-based AI platform to find SoC architectures that optimally parallelize massive data computation and data transfer. First, the proposed DSE explores both functional blocks, which undertake process execution, and their parameters, i.e., the number of instances and PEs, to parallelize the CNN's intensive intra-process computation with ease of system modeling and exploration. Second, a multi-layer bus architecture and its configuration are optimized to parallelize data transfer by performing master-slave clustering with three-step channel mapping. Experimental results show that the proposed DSE with a pruning technique found 17 Pareto-optimal architectures from a design space of 2 million architectures within 11.5 hours, a 21% time reduction compared to exhaustive exploration.

R1-3 (Time: 10:24 - 10:26)
 Title Reconfigurable Activation Functions for Neural Networks Application Author Yu-Jung Huang (I-Shou University, Taiwan), Meng-Jhe Li, *Wun-Siou Jhong, Shao-I Chu (National Kaohsiung University of Science and Technology, Taiwan) Page pp. 14 - 17 Keyword FPGA, activation function, neural networks Abstract Field programmable gate arrays (FPGAs) have recently become popular for accelerating deep learning networks due to their parallel processing and reconfigurable capabilities as well as their energy efficiency. This paper presents a multi-layer neural network architecture with novel reconfigurable activation functions that use the coordinate rotation digital computer (CORDIC) technique and the floating-point format (IEEE 754 standard in single precision). The functionality was successfully verified in hardware using a DE2-115 board that included an Altera Cyclone® IV FPGA.
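
The CORDIC technique named in the abstract above can be illustrated with a short software sketch. This is not the authors' hardware design: it is a minimal floating-point model of hyperbolic CORDIC in rotation mode, used here to approximate tanh as one example of an activation function. The names `cordic_tanh` and `max_index` are hypothetical, and the input is assumed to lie within the basic convergence range of roughly |theta| < 1.1.

```python
import math


def cordic_tanh(theta, max_index=16):
    """Approximate tanh(theta) with hyperbolic CORDIC (rotation mode).

    Assumption: |theta| is below about 1.1, the convergence range of
    the basic algorithm without range extension.
    """
    # The CORDIC gain scales x and y equally, so it cancels in y / x.
    x, y, z = 1.0, 0.0, theta
    repeat = 4  # indices 4, 13, 40, ... are run twice for convergence
    for i in range(1, max_index + 1):
        reps = 2 if i == repeat else 1
        if i == repeat:
            repeat = 3 * repeat + 1
        for _ in range(reps):
            d = 1.0 if z >= 0 else -1.0   # rotate so that z goes to 0
            t = 2.0 ** -i                 # a shift in fixed-point hardware
            x, y = x + d * y * t, y + d * x * t
            z -= d * math.atanh(t)        # a table lookup in hardware
    return y / x                          # sinh / cosh


if __name__ == "__main__":
    for value in (-0.8, 0.0, 0.5, 1.0):
        print(value, cordic_tanh(value), math.tanh(value))
```

In hardware the multiplications by `t` become shifts and the `atanh` values come from a small constant table, which is why the same datapath can be reused for several functions.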

R1-4 (Time: 10:26 - 10:28)
 Title Minimization of Energy Consumption of Double Modular Redundancy Design of Conditional Processing by Common Condition Dependency Author *Kazuhito Ito (Saitama University, Japan) Page pp. 18 - 23 Keyword Double modular redundancy, soft error, conditional processing, energy minimization Abstract Double modular redundancy (DMR) executes an operation twice and detects soft errors by comparing the operation results. An error is corrected by executing the necessary operations again. This work considers the DMR design for conditional processing. A method is proposed which makes the secondary executions of the duplicated operations dependent on the primary execution of the condition operation, thereby widening the schedule solution space and allowing better results to be derived. The minimization of energy consumption with the proposed method is formulated as ILP models, and the optimum solution is obtained by using an ILP solver.
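
The basic DMR principle described in the abstract above (execute twice, compare, re-execute on mismatch) can be sketched in software as follows. This shows only the generic detect-and-retry idea, not the paper's scheduling method or ILP formulation. The names `dmr_execute` and `FlakyAdder` are hypothetical, and the injected bit flip is an illustrative stand-in for a soft error.

```python
def dmr_execute(op, args, max_retries=3):
    """Run op twice per round and accept the result only if both agree.

    Returns (result, number_of_executions). A mismatch is treated as a
    detected soft error and triggers re-execution of both copies.
    """
    executions = 0
    for _ in range(max_retries + 1):
        primary = op(*args)
        secondary = op(*args)
        executions += 2
        if primary == secondary:
            return primary, executions
    raise RuntimeError("results never matched within the retry budget")


class FlakyAdder:
    """Hypothetical adder whose first call suffers a single bit flip."""

    def __init__(self):
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        result = a + b
        if self.calls == 1:
            result ^= 0b100  # injected soft error (assumption)
        return result


if __name__ == "__main__":
    print(dmr_execute(FlakyAdder(), (3, 4)))
```

The retry doubles the work for that operation, which is why reducing unnecessary secondary executions matters for energy.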

R1-5 (Time: 10:28 - 10:30)
 Title Application of Overlap-Add FFT Algorithm for Computation Reduction of Convolution Neural Networks Author Hsia-Tsung Wang, *Wei-Kai Cheng (Chung Yuan Christian University, Taiwan) Page pp. 24 - 26 Keyword CNN, FFT Abstract As the computation demand of CNNs is dominated by convolution layers, some studies exploit the duality between the spatial domain and the frequency domain through the fast Fourier transform (FFT) to replace convolutions with pointwise multiplications. However, the FFT approach requires zero padding to enlarge the filter kernel to the same size as the input feature map. In this paper, we apply the overlap-add FFT algorithm to resolve the large zero padding problem in the full FFT model. Our approach fits all filter kernel sizes and especially benefits small kernels such as 3x3. Experiments on ResNet-34 show that, on average, our overlap-add FFT scheme reduces the complexity to nearly 41% of that of convolution, and further to 10% with circuit optimization.
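
The overlap-add idea in the abstract above can be illustrated in one dimension. The sketch below is a simplification under stated assumptions: it is 1-D rather than the 2-D feature maps of a CNN, it uses a plain recursive radix-2 FFT, and the names `fft`, `overlap_add`, `direct_convolution`, and the `block` parameter are hypothetical. It is not the authors' implementation.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def overlap_add(signal, kernel, block=4):
    """Convolve by FFT on short blocks and add the overlapping tails."""
    m = len(kernel)
    n = 1
    while n < block + m - 1:   # FFT size only needs to cover one block
        n *= 2
    k_freq = fft(list(kernel) + [0] * (n - m))
    out = [0.0] * (len(signal) + m - 1)
    for start in range(0, len(signal), block):
        chunk = list(signal[start:start + block])
        c_freq = fft(chunk + [0] * (n - len(chunk)))
        prod = fft([x * h for x, h in zip(c_freq, k_freq)], invert=True)
        for i in range(len(chunk) + m - 1):
            out[start + i] += prod[i].real / n   # inverse FFT scaling
    return out


def direct_convolution(signal, kernel):
    """Reference result computed directly in the spatial domain."""
    size = len(signal) + len(kernel) - 1
    return [sum(signal[i - j] * kernel[j]
                for j in range(len(kernel)) if 0 <= i - j < len(signal))
            for i in range(size)]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(overlap_add(data, [1, 0, -1]))
    print(direct_convolution(data, [1, 0, -1]))
```

The kernel is padded only to the block FFT size, not to the full input length, which is the zero-padding saving the abstract refers to.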

R1-6 (Time: 10:30 - 10:32)
 Title Improving Global Motion Compensation for Frame Interpolation with High-Resolution and High-Frame-Rate Video Author *Keita Ukihashi, Takashi Imagawa (Ritsumeikan University, Japan), Hiroshi Tsutsui, Yoshikazu Miyanaga (Hokkaido University, Japan), Hiroyuki Ochi (Ritsumeikan University, Japan) Page pp. 27 - 32 Keyword frame interpolation, motion compensation Abstract In this paper, we propose a novel global motion compensation method to be used in frame interpolation from input video that consists of high-resolution less-frequent frames (keyframes) and low-resolution high-frame-rate (LR-HF) frames. To generate a better-interpolated background from two keyframes using homography transformation, we improve the accuracy of global motion estimation by eliminating and interpolating feature points (FPs) and by detecting erroneous homography matrices. We also introduce an adaptive weight model for superimposing transformed keyframes. The experimental results show that the proposed method achieves interpolated frames with better quality than the conventional one.

R1-7 (Time: 10:32 - 10:34)
 Title Configurable Processor Hardware Developing Environment for RISC-V with Vector Extension Author *Ryo Taketani (Department of Information Systems Engineering, Osaka University, Japan), Yoshinori Takeuchi (Department of Electric and Electronic Engineering, Kindai University, Japan) Page pp. 33 - 38 Keyword Configurable processor, RISC-V, Vector architecture Abstract This research proposes a configurable processor hardware development environment for RISC-V with the vector extension. RISC-V is attracting increasing attention as an open instruction set architecture. RISC-V specifies a vector extension for parallel computing that takes power savings and execution cycle performance into consideration. We took on the challenge of implementing a RISC-V based hardware processor with the vector extension and evaluated it.

R1-8 (Time: 10:34 - 10:36)
 Title Improved Multiplier Architecture on ASIC for RLWE-based Key Exchange Author *Tatsuki Ono, Song Bian, Takashi Sato (Graduate School of Informatics, Kyoto University, Japan) Page pp. 39 - 40 Keyword ring learning with errors, application specific integrated circuit, cryptography, key exchange, multiplier Abstract The ring learning with errors (RLWE) problem is one of the most promising candidates for constructing quantum-resistant cryptosystems. In this work, we implement an improved hardware multiplier unit for RLWE key exchange schemes. By reducing internal processing units and shortening processing steps, circuit area, power, and latency are reduced to 0.63x, 0.48x, and 0.86x, respectively, compared to the conventional architecture.

R1-9 (Time: 10:36 - 10:38)
 Title Parameter Embedding for Efficient FPGA Implementation of Binarized Neural Networks Author *Reina Sugimoto, Nagisa Ishiura (Kwansei Gakuin University, Japan) Page pp. 41 - 45 Keyword binarized neural network, FPGA implementation, parameter embedding Abstract A binarized neural network (BNN), a restricted type of neural network where weights and activations are binary, enables compact hardware implementation. While the existing architectures for BNNs assume that weights and biases are stored in on-chip RAMs, this paper presents an attempt to embed those parameters into processing elements by utilizing LUTs in FPGAs as ROMs. This eliminates the bandwidth limitation between memories and neuron PEs, allows higher parallelism, and reduces the hardware cost of the neuron PEs. This paper also proposes a map-shift scheme to efficiently supply the neuron PEs with feature map data for convolution. As a case study, LeNet5 has been implemented based on this method targeting a Xilinx Artix-7 FPGA, which can process a frame in 1,386 cycles at 21.1MHz.
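
To make the notion of a binarized neuron in the abstract above concrete, the following sketch uses the common XNOR-popcount formulation. That formulation is an assumption brought in for illustration and is not taken from the paper, and the sketch does not model the LUT-as-ROM embedding or the map-shift scheme. The function names and the `threshold` parameter are hypothetical.

```python
def pack_bits(values):
    """Encode a list of +1/-1 values as an integer (+1 -> bit 1)."""
    return sum(1 << i for i, v in enumerate(values) if v == 1)


def xnor_popcount_dot(inputs, weights):
    """Dot product of two +1/-1 vectors using XNOR and popcount."""
    n = len(inputs)
    mask = (1 << n) - 1
    agree = bin(~(pack_bits(inputs) ^ pack_bits(weights)) & mask).count("1")
    # Each agreeing position contributes +1, each disagreeing one -1.
    return 2 * agree - n


def binarized_neuron(inputs, weights, threshold=0):
    """Binary activation: +1 if the dot product reaches the threshold."""
    return 1 if xnor_popcount_dot(inputs, weights) >= threshold else -1


if __name__ == "__main__":
    x = [1, -1, 1, 1]
    w = [1, 1, -1, 1]   # fixed weights, as if embedded in the PE
    print(xnor_popcount_dot(x, w), binarized_neuron(x, w))
```

Because the weights are constants for a trained network, the XNOR with `w` reduces to fixed logic per neuron, which is the kind of function a LUT can hold.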

R1-10 (Time: 10:38 - 10:40)
 Title A 4CH CNN Hardware Architecture for Image Super-Resolution Author *Koyo Suzuki, Kazuki Mori, Nobutaka Kuroki (Kobe University, Japan), Tetsuya Hirose (Osaka University, Japan), Masahiro Numa (Kobe University, Japan) Page pp. 46 - 50 Keyword Super-Resolution, CNN Abstract This paper presents two hardware architectures for super-resolution technology with 4CH CNN (a convolutional neural network with four output channels). We introduce time-division processing to save resources. Moreover, we propose a technique to save resources by sharing part of the circuit in one architecture. Experimental results show that this architecture reduces resources by about 4 to 21 pt. compared to the other architecture. Both architectures run about 5.5 times as fast as software processing.

R1-11 (Time: 10:40 - 10:42)
 Title Approximate Function Configuration by Neural Network on Memory-array Unit Author *Xuechen Zang, Shigetoshi Nakatake (The University of Kitakyushu, Japan), Hiroyuki Kozutsumi, Mitsunori Katsu (TRL Corp., Japan), Shoichi Sekiguchi (TAIYO YUDEN Co., LTD, Japan) Page pp. 51 - 55 Keyword Approximate Computing, Reconfigurable Systems, MRLD, Approximate Logic Abstract This paper presents approximate computing consistent with a memory-based reconfigurable logic device (MRLD). We propose a novel implementation flow to realize a function of a multiple look up table (MLUT) by employing neural network (NN) based machine learning. Like function fitting, our method implements a logic function induced by a set of inputs and outputs. To verify the performance of the approximate computing implementation, we compare a general polynomial regression method and deep neural networks. The results suggest that a relatively deeper NN is superior in loss value and accuracy rate. The NN models achieve a lower symbol error rate (SER) and a considerable loss reduction compared to the polynomial regression. In addition, we demonstrate how to use such models for an 8-bit inverter logic example.

R1-12 (Time: 10:42 - 10:44)
 Title A Deep Neuro-Fuzzy for False Decision Prevention on an FPGA Author *Masayuki Shimoda, Hiroki Nakahara (Tokyo Institute of Technology, Japan) Page pp. 56 - 61 Keyword Deep Neural Network, Fuzzy Inference, FPGA Abstract We propose a deep neuro-fuzzy system that consists of a deep neural network (DNN) and fuzzy inference. The fuzzy inference judges from the DNN outputs whether inputs are distinguishable or not, to avoid critical errors (e.g., recognizing malignant data as benign). When our system detects indistinguishable data, it outputs "indistinguishable". Experimental results show that the recall increased by 20.52% in the best case and that its area and computation time are almost the same as those of typical DNNs. Thus, our proposal is more suitable for embedded systems in situations where errors are critical.

R1-13 (Time: 10:44 - 10:46)
 Title A Real Chip Evaluation of a CNN Accelerator SNACC Author *Ryohei Tomura, Takuya Kojima, Hideharu Amano (Dept. of Information and Computer Science, Keio University, Japan), Ryuichi Sakamoto, Masaki Kondo (Graduate School of Information Science and Technology, The University of Tokyo, Japan) Page pp. 62 - 67 Keyword Accelerator, CNN Abstract SNACC (Scalable Neuro Accelerator Core with Cubic integration) is an accelerator for deep neural networks, which can improve performance by increasing the number of stacked chips with an inductive coupling wireless through chip interface (TCI). The chip implementation and real chip evaluation of SNACC are introduced. It consists of four processing element cores which execute dedicated SIMD instructions, distributed memory modules for storing weight data, and the TCI. The real chip evaluation using Renesas Electronics’ 65nm SOTB (Silicon On Thin Box) CMOS technology shows that a simple CNN, LeNet, works at 50MHz for all layers with a 0.90V supply voltage. The power consumption is less than 12mW. The performance can be enhanced by about 15% with forward body biasing in exchange for an increase of about 2mW in leakage. Also, SNACC achieved more than 20 times higher performance than a MIPS R3000 compatible embedded processor.

R1-14 (Time: 10:46 - 10:48)
 Title IMU-based Rehabilitation System for Upper and Lower Limbs Author Chun-Jui Chen, Yi-Ting Lin, Chia-Chun Lin (Department of Computer Science, National Tsing Hua University, Taiwan), Yung-Chih Chen (Department of Computer Science and Engineering, Yuan Ze University, Taiwan), *Chun-Yao Wang (Department of Computer Science, National Tsing Hua University, Taiwan) Page pp. 68 - 73 Keyword Rehabilitation, knee angle, elbow angle Abstract In this work, we present an IMU-based rehabilitation system for upper and lower limbs. This system uses two wearable IMU sensors to detect rehabilitation motions of patients suffering from frozen shoulder or recovering from knee and hip surgeries. The sensors are also connected to a smartphone via Bluetooth, and an Android app is designed to show the correctness and the statistics of the rehabilitation exercises. The experimental results show that the average errors of knee angle and elbow angle are both less than 5°. The average recognition rates of all rehabilitation exercises are larger than 85%.

R1-15 (Time: 10:48 - 10:50)
 Title A Smart Single-Sensor Device for Instantaneously Monitoring Lower Limb Exercises Author Yan-Ping Chang, Teng-Chia Wang, Chun-Jui Chen, Chia-Chun Lin (National Tsing Hua University, Taiwan), *Yung-Chih Chen (Yuan Ze University, Taiwan), Chun-Yao Wang (National Tsing Hua University, Taiwan) Page pp. 74 - 79 Keyword stride count, walking distance, 9-axial sensor Abstract Studies have shown that stair exercises can enhance the strength of lower limbs for patients with limb disorders. However, only a few systems can monitor lower limb exercises in medical institutes. To analyze lower limb exercises instantaneously, we propose a smart single-sensor wearable device, S3-Sock, mounted on shoes. The sock can monitor and measure the stride count, step height, and distance of the step trajectory of lower limb exercises. The experimental results demonstrate that the proposed system is reliable under different lower limb exercises. The averages of absolute mean errors of stride count in stair-climbing and walking are about 2.00% and 0.88%, respectively. The averages of absolute mean errors of step height are about 5.12% and 8.23% in step-by-step and step-over-step stair climbing, respectively.

R1-16 (Time: 10:50 - 10:52)
 Title 1-D GDR Aware Cell Generation via P/N bi-partition Author Yao-Lin Chang, Hung-Ming Chen, *Wei-Tung Chao, Chien-Hung Lin (National Chiao Tung University, Taiwan) Page pp. 80 - 81 Keyword layout, standard cell Abstract As the complexity of layout design grows, the layout generation problem has become more challenging. This work features the bi-partition tree and the selective stage. With this bi-partition tree, we speed up the layout generation flow and guarantee no additional wire length. With objective functions in the placement selection stage and the routing stage, a lithography-friendly layout with low congestion, minimum area and high performance is accomplished.

Invited Talk I
Time: 13:20 - 14:10 Monday, October 21, 2019
Chair: Shigeru Yamashita (Ritsumeikan University, Japan)

I1-1 (Time: 13:20 - 14:10)
 Title (Invited Talk) LSI Design and Current Topics for Automotives Author *Toshihiro Hattori (Renesas Electronics, Japan) Page p. 82 Keyword Automotives Abstract Automotive is one of the major applications for semiconductor devices, and semiconductor devices are the key factors supporting the current innovation of MOBILITY (automotive) systems. First, I will explain the different needs, features, and technologies for automotive-oriented LSIs. As you know, automotive technology is undergoing a drastic innovation led by the keywords “CASE (Connected, Autonomous, Shared & Services, Electric)” and “MaaS (Mobility as a Service)”. I will give an overview of the trends and needs for automotive LSIs. Functional safety and security are the key technologies required for current automotive LSIs. I will explain the trends and background of autonomous driving and show an example of the latest implementation of autonomous driving support LSIs. I will show the background of the functional safety trends and the example of a 28nm automotive flash microcontroller for next-generation automotive architecture complying with ISO26262 ASIL-D. I will show the background of the security trends in automotive and the example of a 24MB embedded flash system based on 28nm SG-MONOS featuring robust over-the-air software update.

Regular Poster Session II
Time: 14:10 - 15:40 Monday, October 21, 2019
Chairs: Yu-Guang Chen (National Central University, Taiwan), Ching-Hwa Cheng (Feng Chia University, Taiwan)

Outstanding Paper Awards
R2-1 (Time: 14:10 - 14:12)
 Title Insertion Based Procedural Construction of Parallel Prefix Adders Author *Bo-Yu Tseng, Mineo Kaneko (Japan Advanced Institute of Science and Technology, Japan) Page pp. 83 - 88 Keyword adder, optimization, binary tree Abstract As a novel approach to the design of parallel prefix adders, the framework of the procedural construction of parallel prefix adders has been proposed. This approach aims to configure the prefix tree structure by a sequence of basic structural operations. Among several basic operations, "insertion" has the potential to produce a variety of prefix structures while keeping the hardware cost low. This paper explores the essential structural variations achieved by the insertion operation, and proposes a coding scheme which can represent all these essential variations while excluding redundancy as much as possible. In our approach, we focus on the sequence of insertion operations applied at various positions, and propose to use a binary tree to specify the order of applying insertion operations. Our discussions in this paper would be an important base for the optimization of parallel prefix adders, which is one of our future works.
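
For readers unfamiliar with parallel prefix adders, the sketch below models one fixed, well-known prefix structure (Kogge-Stone) in software. It is background only: the paper's insertion operation and coding scheme are not shown, and the name `prefix_add` and the `width` parameter are hypothetical.

```python
def prefix_add(a, b, width=8):
    """Add two unsigned integers with a Kogge-Stone prefix network."""
    # Bitwise generate (both bits set) and propagate (exactly one set).
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]

    # Prefix stage: combine (G, P) pairs over doubling distances.
    G, P = g[:], p[:]
    dist = 1
    while dist < width:
        new_g, new_p = G[:], P[:]
        for i in range(dist, width):
            new_g[i] = G[i] | (P[i] & G[i - dist])
            new_p[i] = P[i] & P[i - dist]
        G, P = new_g, new_p
        dist *= 2

    # G[i] is now the carry out of bit i; the carry into bit 0 is 0.
    total, carry = 0, 0
    for i in range(width):
        total |= (p[i] ^ carry) << i
        carry = G[i]
    return total | (carry << width)


if __name__ == "__main__":
    print(prefix_add(100, 55), prefix_add(255, 255))
```

Different prefix structures compute the same `G` values with different trade-offs in depth, fan-out, and node count, which is the design space that procedural construction explores.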

R2-2 (Time: 14:12 - 14:14)
 Title 3D Test Wrapper Chain Synthesis for Test Time and TSV Count Co-optimization under Constraints on I/O Cells Author Fan-Hsuan Tang, Hsu-Yu Kao, *Shih-Hsu Huang (Chung Yuan Christian University, Taiwan) Page pp. 89 - 94 Keyword SoC Testing, Test Wrapper Chain Synthesis, Design for Testability, TSV Count Minimization, 3D ICs Abstract In addition to test time minimization, the number of testing TSVs is also an important concern for the 3D test wrapper chain synthesis problem. Previous co-optimization algorithms work only when there are no constraints on I/O cells. In this paper, we propose a single-stage KL (Kernighan-Lin) based algorithm to overcome this drawback. Different from previous works, the proposed synthesis algorithm can take specified I/O cell constraints into account during co-optimization. Benchmark data consistently show that the proposed algorithm can greatly reduce both test time and TSV count.

R2-3 (Time: 14:14 - 14:16)
 Title A New Approach to Express Stochastic Numbers Author *Yukino Watanabe, Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan) Page pp. 95 - 98 Keyword Stochastic Computing, Stochastic Numbers Abstract Stochastic Computing (SC) is a technique to calculate complex functions with very small hardware overhead when some small errors are acceptable. SC uses Stochastic Numbers (SNs), which are generally long bit strings (e.g., 1024 bits); we need many cycles to calculate a function with SNs. In this paper, we propose a novel idea to reduce the length of SNs without changing their precision level. Our idea is to express one SN by using two bit strings that have different weights. The multiplication of two SNs in our expression is not trivial, so we propose how to multiply two SNs in our new expression. Then we show some experimental results to confirm that our proposed multiplication can provide almost the same error rate as the conventional SNs with a significantly shorter bit length.
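
As background for the abstract above, the sketch below shows conventional unipolar stochastic computing, where a value in [0, 1] is the fraction of 1s in a bit string and multiplication is a bitwise AND of two independent strings. It does not implement the paper's two-bit-string expression. The function names are hypothetical and the seed is arbitrary.

```python
import random


def to_stochastic(value, length, rng):
    """Encode value in [0, 1] as a random bit string of given length."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def from_stochastic(bits):
    """Decode a stochastic number as the fraction of 1s."""
    return sum(bits) / len(bits)


def sc_multiply(a_bits, b_bits):
    """Multiply two independent unipolar SNs with one AND per bit."""
    return [x & y for x, y in zip(a_bits, b_bits)]


if __name__ == "__main__":
    rng = random.Random(2019)  # fixed seed so the run is repeatable
    a = to_stochastic(0.5, 1024, rng)
    b = to_stochastic(0.25, 1024, rng)
    # The estimate is close to 0.5 * 0.25 but carries sampling error.
    print(from_stochastic(sc_multiply(a, b)))
```

The sampling error shrinks only with the square root of the string length, which is why conventional SNs are long and why shorter encodings are of interest.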

R2-4 (Time: 14:16 - 14:18)
 Title Rapid Single-Flux-Quantum Matrix Multiplication Circuit Utilizing Bit-Level Processing Author *Nobutaka Kito, Takuya Kumagai (Chukyo University, Japan), Kazuyoshi Takagi (Mie University, Japan) Page pp. 99 - 103 Keyword matrix multiplication, RSFQ circuits Abstract A rapid single-flux-quantum (RSFQ) matrix multiplication circuit utilizing bit-level processing is presented. The proposed circuit utilizes characteristics of pulse logic used in RSFQ circuits and utilizes bit-level processing. The circuit carries out multiplications and additions by counting pulses on signal lines. It uses fewer gates compared with previously proposed parallel processing designs and could be realized in small layout area. A layout for 4-bit 4 x 4 matrix multiplication was designed and its correct operation was verified in simulation.

R2-5 (Time: 14:18 - 14:20)
 Title Irregular Bumps Design Planning for Modern Ball Grid Array Packages Author Hsin-Yu Chang, Jyun-Ru Jiang, Simon Chen, Hung-Ming Chen, *Ya-Ying Chien (National Chiao Tung University, Taiwan) Page pp. 104 - 109 Keyword flip-chip packages, routability Abstract In modern flip-chip packages, bumps are often placed irregularly due to different design needs. It costs a great amount of time and manual effort to generate substrate routing from bumps through vias to package balls. Moreover, no single model in prior works could be applied simultaneously between bumps, vias and balls. In this work, we propose a hybrid flow network model to formulate the 2-layer substrate routing problem on an irregular package structure. We present a new bump model that can handle irregular bump plans. With our methodology, signal assignment on vias and balls, and substrate routing on two layers can be obtained at the same time. We also present an iterative optimization technique to improve wire congestion. Our results show that the proposed method completes via and ball assignment efficiently, and obtains 100% routability and an average wirelength improvement of 16.45% compared with manual design in real industrial cases.

R2-6 (Time: 14:20 - 14:22)
 Title Droplet Splitting Routing for Micro-Electrode-Dot-Array Digital Microfluidic Biochips Author *Ikuru Yoshida, Kota Asai (Graduate School of Science and Engineering, Ritsumeikan University, Japan), Tsung-Yi Ho (National Tsing Hua University, Taiwan), Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan) Page pp. 110 - 115 Keyword biochips, droplet routing, micro-electrode dot array Abstract Digital microfluidic biochips (DMFBs) are one of the most promising technologies for sample preparation. Among them, DMFBs based on a micro-electrode dot array (MEDA) overcome the drawbacks of a conventional DMFB. On MEDA-based biochips, we can perform droplet shaping and splitting operations that cannot be performed on a conventional DMFB. In this paper, we propose an efficient droplet routing method that splits droplets in MEDA when there are multiple spaces between blocked regions. We confirm by experiment that our method can indeed reduce the number of time steps needed for droplets to reach target regions.

R2-7 (Time: 14:22 - 14:24)
 Title Exploring Time-space Trade-off for Application Mapping onto 3-D Torus NoCs Author *Yao Hu, Michihiro Koibuchi (National Institute of Informatics, Japan) Page pp. 116 - 117 Keyword Network-on-Chip (NoC), topology embedding, interconnection network, job mapping Abstract One application usually has many parallel tasks running on multiple processing cores which communicate with each other on a many-core chip. Traditionally, the tasks are mapped onto a regular topology of network-on-chip (NoC) with nearby processing cores to reduce the network distances. In this case, fragmentation of unused processing cores may occur when a new incoming application arrives on a chip. In this study, we assume that each application has to be executed on a pre-fixed network topology on a many-core chip with a 3-D torus NoC. To improve the system utilization, i.e., to reduce the number of unused processing cores, we allow the use of non-adjacent processing cores for an application mapping, which form a pre-fixed network topology. We evaluate the time-space trade-off during node allocation with different mapping dilations for the purpose of improving job scheduling abilities. Evaluation results show that, for a large compound workload of NAS Parallel Benchmarks (NPB) applications, the proposed mapping can reduce turnaround time by up to 6% compared with the regular topology mapping on a large 3-D torus NoC.

R2-8 (Time: 14:24 - 14:26)
 Title On Power Supply Pads Planning for Wire-bonded IC Author Hui Zhong Leong, *Ming-Yu Huang, Hung-Ming Chen (NCTU Taiwan, Taiwan), Chang-Tzu Lin (ITRI Taiwan, Taiwan) Page pp. 118 - 121 Keyword power supply, pdn, wire-bonded ic Abstract In wire-bonding technology, Input/Output (I/O) pads are located along the periphery of the integrated circuit (IC), and power pad placement is limited by the available I/O pad candidates. Power pads supply voltage to the IC through the power delivery network (PDN); hence, insufficient power pads may cause IC failure. To overcome this problem, we propose a power pad placement algorithm for wire-bonding technology. Experimental results show that the proposed algorithm determines both power pad counts and power pad locations effectively for a given power delivery network. In addition, the worst voltage drop for the IC is guaranteed to be less than 3% of the supply voltage.

R2-9 (Time: 14:26 - 14:28)
 Title Sample Preparation with Efficient Dilution of Biochemical Fluids using Programmable Microfluidic Devices Author *Ying Shuaijie (Graduate School of Science and Engineering, Ritsumeikan University, Japan), Sudip Roy (Indian Institute of Technology (IIT) Roorkee, India), Juinn-Dar Huang (National Chiao Tung University, Taiwan), Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan) Page pp. 122 - 125 Keyword PMD, Sample preparation, two steps, small area Abstract Sample preparation, which is a front-end process to produce the desired target concentrations of the input reagent fluid, plays a pivotal role in every bioassay or biochemical laboratory protocol. In this paper, we propose two sample preparation algorithms for efficient dilution of biochemical fluids using programmable microfluidic devices (PMDs). The first method, called the dilution algorithm in two steps (DATS), needs only two diluting operations. The other method, called the dilution algorithm in a small dilution area (DASDA), needs less area than DATS.

R2-10 (Time: 14:28 - 14:30)
 Title An Efficient Character Generation Algorithm for High-Throughput E-Beam Lithography Author *Shih-Ting Lin, Hong-Yan Su (National Chiao Tung University, Taiwan), Oscar Chen (AnaGlobe Technology, Inc, Taiwan), Yih-Lang Li (National Chiao Tung University, Taiwan) Page pp. 126 - 131 Keyword Character projection E-beam lithography, exact pattern matching, frequently used character, multi-intersection-level layout Abstract E-beam lithography has been one of the promising next-generation lithography technologies for 7nm and below technology nodes. Among various electron-beam lithography features, character projection (CP) attracts users because complex patterns can be printed in one e-beam shot. However, we still face severe challenges in generating characters on interconnection layers due to their pattern diversity. In this paper, we propose a multi-intersection-level (MIL) layout that can efficiently capture the relationships between nearby objects, including the spacing between them. The inflated layer reduces the problem instance size for identifying the frequently used patterns, while the intersection layers help in clipping windows to obtain an ideal character set. Experimental results show that the proposed methodology can efficiently yield the frequently used character set with up to 93.3% and 81.23% covering rate in the via layer and metal layer, respectively. Besides, for a panel layout, a set of frequently used characters that reaches a 100% covering rate is successfully identified.

R2-11 (Time: 14:30 - 14:32)
 Title Color Balancing-aware Non-Stitch Routing for Multiple Patterning Lithography Author *Jia-Hong Chang, Shao-Yun Fang (National Taiwan University of Science and Technology, Taiwan) Page pp. 132 - 135 Keyword Multiple Patterning Lithography, Color Balancing, Routing Abstract Multiple Patterning Lithography (MPL) is one of the major resolution enhancement technologies for sub-20 nm nodes, which requires decomposing a layout into multiple masks while considering the minimum mask spacing rule. In this paper, we propose an MPL-aware routing algorithm considering mask usage balancing to optimize pattern printability. Different from previous works, stitch insertion is not considered in our router since stitches are usually forbidden in industry to guarantee sufficient yield. To maximize the flexibility in mask usage optimization that is deficient for non-stitch routing, a multiple-objective minimum spanning tree algorithm (MO-MST) is proposed to make the distribution of generated wire segments more scattered. An integer linear programming (ILP)-based color refinement approach is also proposed to optimize mask usage balancing. Experimental results show that the proposed algorithm flow can generate MPL-compliant routing solutions with excellent mask usage balancing for the benchmarks released by 2018 CAD Contest at ICCAD.

R2-12 (Time: 14:32 - 14:34)
 Title An Efficient and Effective Macro Placement Algorithm for Large-Scale Mixed-Size Designs Author Jai-Ming Lin, You-Lun Deng, Ya-Chu Yang, *Jia-Jian Chen (Department of Electrical Engineering, National Cheng Kung University, Taiwan) Page pp. 136 - 137 Keyword macro placement, simulated evolution, physical design, design hierarchy, mixed-size Abstract We propose a novel approach which integrates the simulated evolution algorithm and the corner stitching data structure. Unlike the simulated annealing algorithm adopted by existing works, our approach prevents a solution from getting stuck at a local optimum while taking less runtime. Even when a chip contains several preplaced macros that may not abut the chip boundaries, our approach is able to handle these situations. Experimental results show that our approach obtains better results in wirelength, routability, and runtime.

R2-13 (Time: 14:34 - 14:36)
 Title Thermal Modeling and Simulation of a Smart Wrist-worn Wearable Device Author *Kodai Matsuhashi (Hirosaki University, Japan), Koutaro Hachiya (Teikyo Heisei University, Japan), Toshiki Kanamoto, Masashi Imai, Atsushi Kurokawa (Hirosaki University, Japan) Page pp. 138 - 143 Keyword wearable device, thermal design, smart watch Abstract We propose a thermal-circuit model that can calculate temperatures in important places for thermal designs of smart wrist-worn wearable devices. The thermal model can be applied to various wrist-worn wearable devices, which consist of different device-body shapes, belt sizes, and materials. The temperatures obtained using the proposed model agree well with those obtained by a commercial thermal solver. Moreover, by simulations applying the model, we present important knowledge for thermal designs of wrist-worn wearable devices.

R2-14 (Time: 14:36 - 14:38)
 Title Mixing of Biochemical Fluids using Programmable Microfluidic Devices Author *Yuto Umeda (Graduate School of Science and Engineering, Ritsumeikan University, Japan), Sudip Roy (Indian Institute of Technology (IIT) Roorkee, India), Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan) Page pp. 144 - 149 Keyword programmable microfluidic device, the number of mixing operations, assigning reagents Abstract A programmable microfluidic device (PMD) can mix the reagents in various ratios. In this paper, we propose a mixing method to reduce the number of mixing operations on PMDs. Our method finds the best assignment of each reagent to each mixing operation so that we can reduce the number of mixing operations by simplifying the ratio of reagents and reusing intermediate waste reagent. Experimental results show that our proposed method can make mixing trees with the smallest number of mixing operations.

R2-15 (Time: 14:38 - 14:40)
 Title Generalized Via Pattern Awareness Substrate Routing Framework for Fine Pitch Ball Grid Array Author Jun-Sheng Wu, Chi-An Pan, *Yi-Yu Liu (National Taiwan University of Science and Technology, Taiwan) Page pp. 150 - 151 Keyword Routing, ILP Abstract Packaging substrate has become one of the most important carriers to enable system-level and heterogeneous design within a small footprint size. Instead of applying advanced semiconductor interposer process technologies, the fine pitch ball grid array (FBGA) package substrates are manufactured by mechanical processes. To tackle stringent design rules owing to the mismatched via dimension and miscellaneous routing obstacles, substrate interconnect designs are usually customized by experienced substrate layout engineers. However, fully net-by-net manual design for hundred-scale FBGA is time consuming and error-prone. In this paper, we model the FBGA substrate routing as an integer linear programming (ILP) problem taking various via patterns and design-dependent constraints into account. Two-stage early exit methodology and ILP constraint reduction techniques are developed to boost the runtime of ILP solver. Experimental results indicate the potential of the proposed framework. We argue that complex FBGA designs could be semi-automated by using via pattern candidates to reduce the substrate layout design cycle time.

R2-16 (Time: 14:40 - 14:42)
 Title Acceleration of Radix-Heap based Dijkstra algorithm by Lazy Update Author Tomohiro Takahashi (University of Kitakyushu, Japan), *Yasuhiro Takashima (University of Kitakyushu, Japan) Page pp. 152 - 157 Keyword Dijkstra's algorithm, Lazy update, Radix-heap Abstract This paper proposes a fast Dijkstra algorithm with radix-heap by lazy update, which solves the single source shortest path problem (SSSP). The conventional Dijkstra algorithm chooses one vertex with the minimum tentative distance among the unvisited vertices. For this problem, a relaxation called lazy update has been proposed, which allows multiple vertices rather than only one to be selected while still guaranteeing optimality. In this paper, we apply this lazy update method to the radix-heap based Dijkstra algorithm, which solves SSSP with integer edge distances. The experimental results confirm the efficiency of the proposed method, which executes 50% faster than the conventional Dijkstra algorithm.
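As background for the R2-16 abstract, the following is a minimal sketch of the baseline it builds on: Dijkstra's algorithm with a radix heap for non-negative integer edge weights. It is not the authors' lazy update method. The function name `radix_heap_dijkstra`, the adjacency-list format, and the 64-bit bound on distances are all assumptions made for illustration. Outdated heap entries are simply skipped when popped, which is a common implementation shortcut and should not be confused with the lazy update described in the abstract.

```python
def radix_heap_dijkstra(n, adj, source):
    """Single-source shortest paths with non-negative integer weights.

    n      -- number of vertices, labeled 0..n-1 (hypothetical format)
    adj    -- adj[u] is a list of (v, weight) pairs
    source -- start vertex
    Assumes all finite distances fit in 64 bits.
    """
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0

    # Bucket b holds keys whose XOR with `last` has bit length b,
    # so bucket 0 holds exactly the keys equal to `last`.
    buckets = [[] for _ in range(65)]
    last = 0            # most recently extracted minimum key
    size = 1
    buckets[0].append((0, source))

    while size:
        if not buckets[0]:
            # Find the first non-empty bucket and redistribute it
            # relative to its minimum key, which becomes the new `last`.
            i = next(b for b in range(1, 65) if buckets[b])
            items, buckets[i] = buckets[i], []
            last = min(key for key, _ in items)
            for key, v in items:
                buckets[(key ^ last).bit_length()].append((key, v))

        d, u = buckets[0].pop()
        size -= 1
        if d > dist[u]:
            continue    # outdated entry, a shorter path was already found

        for v, w in adj[u]:
            nd = d + w  # nd >= last, so the heap stays monotone
            if nd < dist[v]:
                dist[v] = nd
                buckets[(nd ^ last).bit_length()].append((nd, v))
                size += 1
    return dist


# Small usage example on a hypothetical 4-vertex graph.
example_adj = [
    [(1, 4), (2, 1)],   # edges out of vertex 0
    [(3, 1)],           # edges out of vertex 1
    [(1, 2), (3, 5)],   # edges out of vertex 2
    [],                 # vertex 3 has no outgoing edges
]
example_dist = radix_heap_dijkstra(4, example_adj, 0)
```

The radix heap relies on extracted keys never decreasing, which holds in Dijkstra's algorithm when all edge weights are non-negative.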

R2-17 (Time: 14:42 - 14:44)
 Title A Global Placement Method for RECON Spare Cells in ECO-Friendly Design Style Author *Junpei Akashi, Suguru Hojo, Nobutaka Kuroki (Kobe University, Japan), Tetsuya Hirose (Osaka University, Japan), Masahiro Numa (Kobe University, Japan) Page pp. 158 - 163 Keyword ECO, reconfigurable cell, error diagnosis, technology remapping Abstract This paper presents an approach to obtain suitable global placement of RECON spare cells in the ECO (Engineering Change Order)-friendly design style based on the statistics for each subregion concerning critical and near-critical paths, occupancy of RECON embedded cells, and utilization of RECON cells. Experimental results have shown that the proposed method is effective in fixing post-mask ECOs while suppressing the increase in the maximum delay time compared with the conventional approach.

R2-18 (Time: 14:44 - 14:46)
 Title An Efficient Thermal Model of Thin Film NiCr Resistors Considering Pulse Response Author *Ryosuke Watanabe (Hirosaki University, Japan), Keita Izawa (Nikkohm Co., Ltd., Japan), Shota Kajiya, Daiki Tsunemoto, Koki Kasai, Atsushi Kurokawa, Toshiki Kanamoto (Hirosaki University, Japan) Page pp. 164 - 167 Keyword Thin film resistors, Thermal circuit analysis Abstract This paper proposes an efficient thermal model of industrial thin film NiCr resistors. We considered the thermal destruction effect of the thin film NiCr resistors under high pulsed power incident conditions. The thin film NiCr resistors considered in this study have two types of thermal time constant. TCAD calculation indicates that a short thermal time constant of around 55 μs exists in the resistors, and experimental results indicate that a long thermal time constant of around 40 seconds exists. Therefore, to analyze the thermal transient behaviors of the resistors more precisely, we propose a thermal circuit model that includes both the short and long thermal time constants. In the model, the thermal resistance and heat capacitance of the thin NiCr sheet are precisely considered, and these parameters are quite important for the existence of the short thermal time constant. The existence of the short thermal time constant in this model is strongly related to the peak temperature of the considered resistors, and we think that the short time thermal response of the thin film NiCr resistors is related to the pulse durability of the resistors.

R2-19 (Time: 14:46 - 14:48)
 Title A Smart Knee Pad for Stride Count and Walking Distance Measurement via Knee Angle Calculation Author Teng-Chia Wang, Yan-Ping Chang, Chun-Jui Chen, *Chia-Chun Lin (National Tsing Hua University, Taiwan), Yung-Chih Chen (Yuan Ze University, Taiwan), Chun-Yao Wang (National Tsing Hua University, Taiwan) Page pp. 168 - 173 Keyword knee angle, stride count, walking distance, 9-axial sensor Abstract To calculate the knee angle, stride counts, and walking distance, we propose a system, iKneePad, fusing two 9-axis sensors with Bluetooth equipped on the thigh and shank segments. The changing rates of hip and knee angles are used to determine the beginning and the ending of a stride. The thigh length, shank length, hip angle, and knee angle are used to calculate the walking distance. The experimental results show that the accuracy of stride count is 100%, and the absolute mean errors of knee angle are 2.99 and 1.42 for the maximum and minimum flexion angles, respectively. For walking distance, the mean error rates are -2.40% and -2.26% for short (10m) and long (33m) distances, respectively. The proposed system also instantly provides feedback to users by showing it on an Android smartphone when conducting rehabilitation or exercise with iKneePad.

Panel Discussion
Time: 15:40 - 17:10 Monday, October 21, 2019
Moderator: Hung-Ming Chen (National Chiao Tung University, Taiwan)

D-1 (Time: 15:40 - 17:10)
 Title (Panel Discussion) Quo Vadis, EDA? Author Moderator: Hung-Ming Chen (National Chiao Tung University, Taiwan), Panelists: Krishnendu Chakrabarty (Duke University, USA), Ulf Schlichtmann (Technische Universität München, Germany), Toshihiro Hattori (Renesas Electronics, Japan), Pai H. Chou (National Tsing Hua University, Taiwan), Akira Fujimaki (Nagoya University, Japan), Donald Lie (Texas Tech University, USA), Organizer: Tsung-Yi Ho (National Tsing Hua University, Taiwan) Page p. 174 Keyword EDA Abstract Nowadays electronics and biomedical designs/applications have been facing critical moments, including the end/extension of Moore's law, killer applications and sustainability issues, etc. How to leverage all possible solutions in design and tools development including the employment of AI is thus essential. In this panel, we have six international researchers leading the discussion in the fields of biomedical, optical designs, automotives, 5G/IoTs, and quantum computing, figuring out how EDA can help shape the future designs.

 Tuesday, October 22, 2019

Keynote Speech II
Time: 9:20 - 10:20 Tuesday, October 22, 2019
Chair: Tsung-Yi Ho (National Tsing Hua University, Taiwan)

K2-1 (Time: 9:20 - 10:20)
 Title (Keynote Speech) EDA for Optical Networks-on-Chip (ONoCs): Achievements and Future Opportunities Author *Ulf Schlichtmann (Technische Universität München, Germany) Page p. 175 Keyword Optical NoCs Abstract Optical Networks on Chip (ONoCs) are a promising technology to resolve some issues which are increasingly plaguing traditional electrical NoCs. Excessive power consumption is chief among these issues. As researchers started looking into architectural options for ONoCs, it soon became apparent that Electronic Design Automation (EDA) would be very beneficial to improve such architectures and especially their physical implementation, e.g. due to the complexity involved. This is true already on a netlist level, but even more so once physical design is considered. Thus, for about 10 years, researchers have been working on EDA approaches for the design of ONoCs. I will review some achievements of EDA for ONoCs, with a focus on physical design (placement, routing). I will discuss current challenges in further improving EDA results. This will be followed by a look at opportunities for EDA research to further improve ONoC architectures. Opportunities exist especially in simultaneously considering multiple design aspects. The emphasis in this talk will be on Wavelength-Routed ONoCs (WRONoCs).

Regular Poster Session III
Time: 10:20 - 11:50 Tuesday, October 22, 2019
Chairs: Yukihide Kohira (University of Aizu, Japan), Yasuhiro Takashima (University of Kitakyushu, Japan)

Outstanding Paper Awards
R3-1 (Time: 10:20 - 10:22)
 Title Efficiency Investigation of Capacitors Mounted on Re-distribution Layers for FOWLP Author *Koki Kasai, Atsushi Kurokawa, Masashi Imai, Toshiki Kanamoto (Hirosaki University, Japan) Page pp. 176 - 179 Keyword PDN, Impedance, Capacitance, FOWLP Abstract This paper provides insights on effective usage of an emerging decoupling capacitor. Power supply noise is one of the most serious concerns in the modern low voltage integrated circuits. Decoupling capacitors embedded in the re-distribution layers (RDL) are potentially effective to reduce the noise caused by the internal switching. However, the effectiveness of them is easily lost due to the equivalent series inductance and resistance. Here, we construct a post-layout simulation test bench to discuss the effectiveness by evaluating impedance profile as well as transient noise waveform. The experimental results show that the horizontal proximity of the RDL embedded capacitors to the noise source is an important factor to keep the advantage.

R3-2 (Time: 10:22 - 10:24)
 Title Unbalanced Splitting Tolerant Sample Preparation Algorithm for Digital Microfluidic Biochips Author Ling-Yen Song, Yi-Ling Chen, Yung-Chun Lei, *Juinn-Dar Huang (Institute of Electronics, National Chiao Tung University, Taiwan) Page pp. 180 - 183 Keyword digital microfluidic biochip, sample preparation, unbalanced splitting, probability-based forecast, forecast-based correction Abstract Sample preparation is regarded as one of the necessary processing steps in most biochemical assays. In the past decade, several techniques have been presented to deal with sample preparation issues under the (1:1) mixing model on digital microfluidic biochips (DMFBs). Most previous works assumed that mixing-then-splitting would produce two identical output droplets. However, due to uncontrollable variabilities, previous works may fail to provide exact solutions in the presence of unbalanced splitting. In this paper, we propose a new forecast-based correction algorithm for the unbalanced splitting problem. Our new algorithm not only guarantees a correct solution, but also requires neither extra reactants nor on-chip specialized hardware. Experimental results show that the effect of unbalanced splitting can be eliminated at the cost of only 20% more operation steps. Therefore, the proposed algorithm is both reliable and efficient.
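For readers unfamiliar with the (1:1) mixing model named in the R3-2 abstract, the sketch below illustrates the classic bit-scanning style of dilution under ideal, perfectly balanced splitting. It is not the forecast-based correction algorithm of the paper. The function name `bit_scan_dilution` and the "sample"/"buffer" labels are hypothetical. The idea is that mixing the running droplet 1:1 with pure sample (concentration 1) or pure buffer (concentration 0) halves the running concentration and adds half of the new fluid's concentration, so scanning the bits of k from least to most significant reaches k/2^n after n mixes.

```python
from fractions import Fraction


def bit_scan_dilution(k, n):
    """Plan n ideal (1:1) mixes that reach concentration k / 2**n.

    Returns (steps, concentration), where steps lists the fluid mixed
    with the running droplet at each step. Assumes perfectly balanced
    splitting, which is exactly the assumption the paper relaxes.
    """
    if not 0 < k < 2 ** n:
        raise ValueError("k must satisfy 0 < k < 2**n")

    conc = Fraction(0)          # start from a pure buffer droplet
    steps = []
    for i in range(n):          # scan bits from least significant
        bit = (k >> i) & 1
        steps.append("sample" if bit else "buffer")
        conc = (conc + bit) / 2  # 1:1 mix averages the two inputs
    return steps, conc


# Usage: target concentration 5/8, i.e. k = 5 (binary 101), n = 3.
plan, final_conc = bit_scan_dilution(5, 3)
```

Exact rational arithmetic via `Fraction` keeps the concentration free of floating-point rounding, which makes the halving at each step easy to verify by hand.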

R3-3 (Time: 10:24 - 10:26)
 Title KR-CHIP: An Educational Computer equipped with 8-bit Accumulator-based, 16-bit Accumulator-based and 32-bit Pipeline Processors Author Hiroyuki Kanbara (ASTEM RI, Japan), Kagumi Azuma, Yuuki Oosako (Kwansei Gakuin University, Japan), Atsuya Shibata (Nara Institute of Science and Technology, Japan), *Wakako Nakano (Kwansei Gakuin University, Japan) Page pp. 184 - 189 Keyword Education, CPU, FPGA, Accumulator-based, Pipeline Abstract This article presents a processor for computer education named KR-CHIP. KR-CHIP integrates 3 CPUs: 8-bit accumulator-based, 16-bit accumulator-based and 32-bit pipeline architecture. Every register, counter, flag and memory can be observed directly by hardware at any clock cycle or at any phase of instruction execution. KR-CHIP is useful for beginners of computer hardware to understand how instructions are processed inside a CPU.

R3-4 (Time: 10:26 - 10:28)
 Title A Trial of Electric Chemical Degradation Process Simulation for Lead-acid Batteries Author *Daiki Imai, Masahiro Fukui (Ritsumeikan University, Japan), Keiichi Hasegawa (Plan Be, Japan) Page pp. 190 - 191 Keyword Battery Management, Simulation, Optimization, Lead-acid Battery Abstract A trial of computer simulation for degradation of lead-acid battery is examined by the concepts of reaction distance. The recovery rate depends on the time of charge after discharge, the reaction distance, and the particle diameter of PbSO4 salts.

R3-5 (Time: 10:28 - 10:30)
 Title Register Minimization in Double Modular Redundancy Design with Soft Error Correction by Replay Author *Yuya Kitazawa (Saitama University, Japan), Shinichi Nishizawa (Fukuoka University, Japan), Kazuhito Ito (Saitama University, Japan) Page pp. 192 - 197 Keyword Double modular redundancy, soft error, register minimization Abstract Double modular redundancy (DMR) is to execute an operation twice and detect soft error by comparing the duplicated operation results. The soft error is corrected by executing necessary operations again, called replay. The replay requires error-free input data and registers are needed to store such necessary error-free data. In this paper, a method to minimize the required number of registers is proposed where replay intervals are appropriately selected so as not to increase the register requirement. The experimental results show up to 27% reduction of required registers.

R3-6 (Time: 10:30 - 10:32)
 Title Comparison of Diagnostic Performance Metrics for Test Point Selection in Analog Circuits Author *Koutaro Hachiya (Teikyo Heisei University, Japan), Atsushi Kurokawa (Hirosaki University, Japan) Page pp. 198 - 203 Keyword Analog Test, Diagnostic Performance Metric, 3D-IC, Through Silicon Via Abstract Diagnostic performance metrics proposed in the literature for finding measurement points in analog circuits are compared in terms of four properties: related to test metrics, sensitivity, symmetric and parametric. Based on the comparison result, a guideline for metrics selection is proposed. As a case study, the metrics are applied to finding measurement points to detect open defects of through silicon vias in power distribution networks of 3D-ICs.

R3-7 (Time: 10:32 - 10:34)
 Title A 12-bit 500-kS/s SAR ADC with Reconfigurable Mismatch Tolerance Author *Yu-Hsiang Nien, Tsung-Heng Tsai (National Chung Cheng University, Taiwan) Page pp. 204 - 207 Keyword SAR ADC Abstract This paper presents an energy-efficient 12-bit 500-kS/s SAR ADC with reconfigurable mismatch tolerance for high-resolution wearable biomedical sensor networks. Switching-back is used to create a tolerance range of 1/4 Vref per bit. Reconfigurable mismatch tolerance (RTM) is assigned for each bit independently to compensate process variations. In this work, the unit capacitance is 1 fF. This SAR ADC consumes 39.5 μW at 500 kS/s under a 1 V supply in a 65 nm CMOS process. It achieves a signal-to-noise and distortion ratio of 64.79 dB. The effective number of bits (ENOB) is 10.4 bits, resulting in a figure of merit of 55.6 fJ/conversion-step. The implemented prototype occupies an active area of 0.178 mm².
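The figures quoted in the R3-7 abstract can be related to each other with two widely used textbook formulas: the ideal-quantizer relation ENOB = (SNDR - 1.76) / 6.02 and the Walden figure of merit FoM = P / (2^ENOB * fs). The abstract does not say which formulas the authors used, so treating these as the underlying relations is an assumption, and the helper names below are hypothetical. Under that assumption, the sketch recomputes both quantities from the reported SNDR, power, and sampling rate, and the results land close to the values stated in the abstract.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (ideal-quantizer relation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, enob_bits, sample_rate_hz):
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob_bits * sample_rate_hz)


# Values taken from the abstract.
sndr_db = 64.79          # signal-to-noise and distortion ratio, dB
power_w = 39.5e-6        # 39.5 uW
sample_rate_hz = 500e3   # 500 kS/s

enob = enob_from_sndr(sndr_db)
fom_fj = walden_fom(power_w, enob, sample_rate_hz) * 1e15  # J -> fJ

print(f"ENOB = {enob:.2f} bits, FoM = {fom_fj:.1f} fJ/conversion-step")
```

Small differences from the published numbers are expected, since the result depends on how ENOB is rounded before it enters the exponent.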

R3-8 (Time: 10:34 - 10:36)
 Title High-level synthesis code optimization with loop fusion based on LLVM/Polly Author *Yuta Hiyama, Takayuki Todokoro, Kenshu Seto (Tokyo City University, Japan), Masato Tatsuoka (Socionext Inc., Japan Advanced Institute of Science and Technology, Japan), Yoshihito Nishida (Socionext Inc., Japan), Mineo Kaneko (Japan Advanced Institute of Science and Technology, Japan) Page pp. 208 - 213 Keyword Loop fusion, Polyhedral model, High-level synthesis, LLVM, Polly Abstract Loop fusion is an effective loop optimization for high-level synthesis. Loop fusion can be performed automatically with an LLVM-based polyhedral compiler called Polly. However, Polly's loop fusion algorithm may output a loop structure unsuitable for high-level synthesis. We implemented an algorithm that uses Polly to output a loop structure suitable for high-level synthesis. The proposed method reduced the average number of execution cycles for high-level synthesis by 33.4% compared to that before loop fusion.

R3-9 (Time: 10:36 - 10:38)
 Title Ultra Low Current Measurement with On-chip High Resistance of MOSFET Array Author *Xinghuai Zhang, Daishi Isogai, Takaaki Shirakawa, Shigetoshi Nakatake (The University of Kitakyushu, Japan) Page pp. 214 - 217 Keyword On-chip High Resistance, Ultra Low Current, Sensor Abstract We propose an on-chip high resistance using a MOSFET array. We adopt the potentiostat method for electrochemical sensing to measure ultra low current, with biosensing and implant sensing in mind. The sensor circuit includes a high resistance array which is configured by connecting unit resistors in series and parallel. We verify the DC characteristics, the area, and the temperature characteristics of the resistor array by SPICE simulation, then demonstrate promising results compared with the conventional poly resistance.

R3-10 (Time: 10:38 - 10:40)
 Title A Note on Optimization Algorithms for FF/Latch-Based High-Level Synthesis Author *Keisuke Inoue (International College of Technology, Kanazawa, Japan) Page pp. 218 - 222 Keyword high-level synthesis, latch Abstract This paper presents a new design framework for register-transfer-level data-paths. The conventional D-flip-flop-based register (D-REG) is very practical, since the designers can concentrate only on the timing constraints between registers. However, with the development of deep sub-micron technology and the increase in the data length, the D-REG hardware cost is becoming relatively larger than the other hardware resources. Thus, latch-based design methods have been proposed as alternatives to D-REG-based design methods, since the latch-based register has smaller hardware cost than D-REG. A disadvantage of the conventional latch-based architecture is the increase in the hardware resources. As a result, the total register cost cannot be fully reduced. We propose a new design framework, a kind of level-triggered latch design, in which a D-REG is replaced by a pair of latch-based registers: a master latch-based register (M-REG) and a slave latch-based register (S-REG).

R3-11 (Time: 10:40 - 10:42)
 Title FPGA Implementation for WDF-Based Analog Emulator with Complicated Topology Author Hsin-Ju Hsu (National Chiao Tung University, Taiwan), Ji-Xuan Tsai, Meng-Lin Li (National Central University, Taiwan), *Chien-Nan Liu (National Chiao Tung University, Taiwan), Jing-Yang Jou (National Central University, Taiwan) Page pp. 223 - 226 Keyword WDF, analog emulation, FPGA, system verification Abstract System verification is still a big challenge for system-on-chip (SoC) designs with AMS circuits. The wave digital filter (WDF)-based approach is a possible solution to emulate analog circuits in existing FPGAs with digital circuits. In order to solve the loop problem in WDF structures, a special J-type adaptor was proposed. However, the automatic transformation flow and the corresponding FPGA implementation flow with this new J-type adaptor are not discussed in previous papers. Therefore, this paper focuses on the hardware implementation issues for WDF-based analog emulators with the J-type adaptor. The FPGA results on several circuits with nonlinear elements have demonstrated the effectiveness and feasibility of the proposed solution for supporting various circuit types on an FPGA-based platform.

R3-12 (Time: 10:42 - 10:44)
 Title Binary Synthesis from RISC-V Executables Author *Shoki Hamana, Nagisa Ishiura (Kwansei Gakuin University, Japan) Page pp. 227 - 228 Keyword high-level synthesis, binary synthesis, RISC-V Abstract This paper presents an implementation of a binary synthesizer which converts a given executable binary code of RISC-V into hardware functionally equivalent to a RISC-V core executing the code. A CPU core and an instruction memory are replaced by the synthesized hardware, which reduces execution time and hardware size for small scale programs. A given binary code is disassembled and parsed to build a control dataflow graph (CDFG), then traditional high-level synthesis techniques are applied to generate RT level Verilog HDL. For small example programs consisting of 34 to 160 instructions, the synthesized hardware on a Xilinx Artix-7 FPGA took about 74.5% fewer cycles than the RISC-V Rocket core, with a smaller number of LUTs.

R3-13 (Time: 10:44 - 10:46)
 Title Detection of Vulnerability Guard Elimination by Compiler Optimization Based on Binary Code Comparison Author *Yuka Azuma, Nagisa Ishiura (Kwansei Gakuin University, Japan) Page pp. 229 - 230 Keyword software security, compiler optimization, undefined behavior, binary comparison, buffer overflow Abstract It is known that guards against vulnerabilities in C programs might be eliminated by compiler optimization if they are not written properly. This paper proposes a method to detect such flaws in software by binary code comparison. Given a source code, a pair of binary codes is generated, one with standard optimization and the other with problematic optimization suppressed. Since simple comparison of the binary codes ends up with an unacceptable number of false positives, call instructions in each function are collated to detect discrepancies. In a preliminary experiment on 7 programs, our method successfully detected 2 instances of guard losses with only one false positive.

R3-14 (Time: 10:46 - 10:48)
 Title A Stable Equivalent Circuit Identification Algorithm for Li ion Batteries Author *Lei Lin, Masahiro Fukui (Ritsumeikan University, Japan) Page pp. 231 - 236 Keyword SOC, Estimation, Parameter, RLS, EKF Abstract This paper discusses a method for synchronous estimation of the equivalent circuit parameters and state of a Li-ion battery. In the conventional method, accuracy and stability are hard to improve. In order to solve this problem, we propose synchronous estimation of the equivalent circuit parameters and state with feedback. In this paper, we demonstrate the effectiveness of this solution through experiments.

R3-15 (Time: 10:48 - 10:50)
 Title An Intravesical Urine Volume Sensor Robust to Body Posture and Movement Author *Ryousuke Sakai, Shigetoshi Nakatake (The University of Kitakyushu, Japan) Page pp. 237 - 238 Keyword Biomedical sensor, AC impedance method, Intravesical urine volume, IoT device Abstract In this work, in order to prevent urinary incontinence, we aim to estimate the urination condition from the body water amount in the vicinity of the bladder. Our sensor is robust to body posture and movement because it applies the AC impedance method to the bladder. We implement an impedance-based prototype system and conduct experiments to estimate intravesical urine volume. As a result, we confirmed that the impedance value decreased with time after drinking water. In addition, we compare the measurement results with a commercial ultrasonic monitoring system and verify the robustness of our proposed system to body posture and movement.

R3-16 (Time: 10:50 - 10:52)
 Title Test Pattern Generation for Timing Faults in Rapid Single-Flux-Quantum Circuits Author *Kazuyoshi Takagi (Mie University, Japan), Mikihiro Ono (Kyoto University, Japan), Nobutaka Kito (Chukyo University, Japan), Naofumi Takagi (Kyoto University, Japan) Page pp. 239 - 243 Keyword Superconducting RSFQ circuits, test pattern generation, timing faults, fault detection, fault diagnosis Abstract A new fault model and test pattern generation methods considering characteristics of superconducting Rapid Single-Flux-Quantum (RSFQ) logic circuits are presented. We define a timing fault model for RSFQ circuits by focusing on the order of pulse arrivals at each clocked logic gate. Subject to the fault model, we propose test pattern generation methods for fault detection and fault diagnosis of RSFQ circuits.

R3-17 (Time: 10:52 - 10:54)
 Title Incremental Approaches for Locating Design Errors: Averaging EPI-Groups and Generating Additional Input Patterns Author *Shogo Ohmura, Hiroshi Nakano, Nobutaka Kuroki (Kobe University, Japan), Tetsuya Hirose (Osaka University, Japan), Masahiro Numa (Kobe University, Japan) Page pp. 244 - 249 Keyword error diagnosis, ECO, PLEM, EPI Abstract This paper presents two kinds of incremental approaches for locating design errors: averaging EPI-groups and generating additional input patterns to reduce EPI values used for extraction of error location sets in order to shorten the processing time. The experimental results have shown that the proposed techniques are effective to reduce the number of initial error location sets by 96.8% or more, and to shorten the processing time by 86.6% or more.

Invited Talk II
Time: 13:20 - 14:10 Tuesday, October 22, 2019
Chair: Chia-Heng Tu (National Cheng Kung University, Taiwan)

I2-1 (Time: 13:20 - 14:10)
 Title (Invited Talk) IoT for Enabling Precision Medicine Author *Pai H. Chou (National Tsing Hua University, Taiwan) Page p. 250 Keyword IoT Abstract IoT technologies have the potential of revolutionizing medicine by enabling precision diagnostics and treatment. Medical misdiagnoses are frequently caused by over-reliance on patients' biased recall and by measurement limited to the clinical settings. Doctors also have little control over follow-up treatment prescribed for outside the clinic. These limitations can be overcome by a combination of wearable medical and non-medical IoT devices that produce objective, unbiased data from or around the patient. This talk presents a number of case studies on the design of such IoT devices to enable precision medicine, including cardiovascular and pulmonary applications.

Regular Poster Session IV
Time: 14:10 - 15:40 Tuesday, October 22, 2019
Chairs: Chien-Nan Liu (National Chiao Tung University, Taiwan), Lih-Yih Chiou (National Cheng Kung University, Taiwan)

Outstanding Paper Awards
R4-1 (Time: 14:10 - 14:12)
 Title A Case Study on Design of Approximate Multipliers for MNIST CNN Author *Kenta Shirane, Takahiro Yamamoto, Hiroyuki Tomiyama (Ritsumeikan University, Japan) Page pp. 251 - 255 Keyword Approximate Computing, Approximate Multiplier, CNN, MNIST Abstract In this paper, we present a case study on approximate multipliers for MNIST CNN. We apply approximate multipliers with different bit-widths to the convolution layer in MNIST CNN, evaluate the accuracy of MNIST recognition, and analyze the trade-off between the approximate multiplier’s area, critical path delay and the accuracy. We further reduce the area and delay of the multipliers while keeping high accuracy in MNIST CNN.

R4-2 (Time: 14:12 - 14:14)
 Title A Layout Design Method of QCA without Fixing Data Flow Author *Kazuki Morita, Wakaki Hattori, Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan) Page pp. 256 - 261 Keyword Quantum-dot Cellular Automata, clocking scheme, Field-Coupled Nanotechnology Abstract Quantum-dot Cellular Automata (QCA) is a promising nanotechnology with ultra-low power consumption and high clock rates. Thus, QCA overcomes the physical limitation of conventional technologies like CMOS and is an alternative technology to maintain Moore's law. Pre-planned zone clocking schemes have been proposed in order to facilitate the design of a QCA circuit. In a QCA circuit designed with a pre-planned zone clocking scheme, data flows are predetermined, which leads to an increase in circuit area. To solve this problem, this paper proposes a new approach to find an efficient data flow for a circuit. Experimental results show the usefulness of the proposed method.

R4-3 (Time: 14:14 - 14:16)
 Title An Error Diagnosis Technique Using ZDD to Extract Error Location Sets Author *Hiroshi Nakano, Shogo Ohmura, Nobutaka Kuroki (Kobe University, Japan), Tetsuya Hirose (Osaka University, Japan), Masahiro Numa (Kobe University, Japan) Page pp. 262 - 267 Keyword error diagnosis, ZDD, ECO Abstract This paper presents an error diagnosis technique using ZDD (zero-suppressed binary decision diagram) to extract error location sets. A ZDD represents error location sets implicitly, which reduces the processing time to extract them. Experimental results have shown that the proposed technique reduces the processing time by 92.4% on average, and that the proposed variable ordering technique reduces ZDD node counts by 86.5% for large circuits.

R4-4 (Time: 14:16 - 14:18)
 Title Performance Improvements for Block-Flushing Author *Bao Yifang (Graduate School of Science and Engineering, Ritsumeikan University, Japan), Bing Li (Technical University of Munich, Germany), Tsung-Yi Ho (National Tsing Hua University, Taiwan), Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan) Page pp. 268 - 269 Keyword Block-Flushing, Path Changing, PMD Abstract During the execution of multiple bioassays, some areas on Programmable Microfluidic Devices (PMDs) become contaminated and must be cleaned by washing them with a buffer flow before they are reused. An efficient washing method called “Block-Flushing” has been proposed. We show that Block-Flushing can complicate the cleaning work in specific cases, and we then propose an improvement of Block-Flushing that alleviates the situation by carefully adjusting the flushing paths.

R4-5 (Time: 14:18 - 14:20)
 Title A Proposal of Application Specific Approach with RISC-V Processor on FPGA Author *Tetsuo Miyauchi, Kiyofumi Tanaka (Japan Advanced Institute of Science and Technology, Japan) Page pp. 270 - 273 Keyword RISC-V, FPGA, Processor, Adapting Abstract Currently, the number of IoT (Internet of Things) devices is increasing. In IoT devices, a small footprint is desirable. RISC-V is an open processor architecture that is becoming popular for IoT devices. We implemented a RISC-V soft processor core with a 5-stage pipeline on an FPGA; its instruction set is RV32IM (the 32-bit base integer instruction set plus multiplication/division). In this paper, we propose a method for reducing hardware resources by adapting the processor core to an application program. We show that our approach can reduce the necessary FPGA resources to 14.8% (Rijndael) and 14.4% (Matrix) of the full processor core implementation.

R4-6 (Time: 14:20 - 14:22)
 Title A Study on the Optimization of Asynchronous Circuits During RTL Conversion from Synchronous Circuits Author *Shogo Semba, Hiroshi Saito (The University of Aizu, Japan) Page pp. 274 - 279 Keyword asynchronous circuits, RTL design, optimization Abstract In this paper, we propose three optimization methods for asynchronous circuits during the Register Transfer Level (RTL) conversion from synchronous RTL models. The modularization of datapath resources and the restriction of the use of D flip-flops reduce the circuit area, while fixing the control signal of the multiplexers reduces the dynamic power consumption. In the experiment, we evaluated the effect of the three optimization methods. The combination of the three optimization methods reduced the energy consumption by 24.6% for a differential equation solver and by 12.6% for a tiny encryption algorithm, compared to the circuits without the proposed optimization methods.

R4-7 (Time: 14:22 - 14:24)
 Title Effect of Reducing the Bit Length of LFSRs for SC Author *Yudai Sakamoto, Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan) Page pp. 280 - 285 Keyword Stochastic Computing, Stochastic Number, linear-feedback shift register (LFSR) Abstract Stochastic Computing (SC) is an approximation method to calculate functions by using Stochastic Numbers (SNs) which are generated by a linear-feedback shift register (LFSR) and a comparator in general. In this paper, we propose a method to reduce the bit length of LFSRs, and then we verify the errors of the proposed method. We provide some experimental results by which we can confirm that our proposed scheme is very useful.
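
The LFSR-plus-comparator generation of stochastic numbers mentioned in the abstract above can be illustrated with a minimal Python sketch. This is a generic textbook-style construction, not the authors' proposed bit-length reduction method, which the abstract does not describe. The 8-bit width, the tap positions (8, 6, 5, 4), and the names `lfsr8`, `to_stochastic`, and `decode` are assumptions chosen for illustration.

```python
def lfsr8(seed=1):
    """Yield successive states of an 8-bit Fibonacci LFSR (taps 8, 6, 5, 4).

    The tap choice is an illustrative assumption; it cycles through all
    255 non-zero states before repeating.
    """
    state = seed & 0xFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        yield state
        # XOR the tapped bits to form the bit fed back into the top position.
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
        state = (state >> 1) | (bit << 7)


def to_stochastic(value, length=255, seed=1):
    """Comparator stage: emit 1 whenever the LFSR state is <= value."""
    gen = lfsr8(seed)
    return [1 if next(gen) <= value else 0 for _ in range(length)]


def decode(stream):
    """Recover the encoded value as the fraction of ones in the stream."""
    return sum(stream) / len(stream)


stream = to_stochastic(64)
print(decode(stream))  # fraction of ones, close to 64/255
```

Over one full LFSR period every non-zero state appears exactly once, so the fraction of ones tracks the compared value; shorter streams or shorter LFSRs trade accuracy for cost, which is the kind of error the abstract refers to.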

R4-8 (Time: 14:24 - 14:26)
 Title Design of Asynchronous Circuits on Commercial FPGAs Using Placement Constraints Author *Tatsuki Otake, Hiroshi Saito (The University of Aizu, Japan) Page pp. 286 - 291 Keyword asynchronous circuits, FPGA, placement constraints Abstract In this paper, we propose a design method to implement asynchronous circuits with bundled-data implementation on commercial Field Programmable Gate Arrays (FPGAs) using placement constraints. Using the proposed method, we can obtain asynchronous circuits whose performance is close to that of their synchronous counterparts and whose energy consumption is smaller (21.3% reduction on average), with fewer delay adjustments.

R4-9 (Time: 14:26 - 14:28)
 Title Parallelizing SAT-based Coverage-Driven Design Verification Author *Kiyoharu Hamaguchi (Shimane University, Japan) Page pp. 292 - 295 Keyword design verification, SAT solver, coverage-driven verification, automated testbench Abstract We show results on parallelization of automated coverage-driven verification. In our prior work, we have shown an approach which combines random simulation with input pattern generation using a SAT solver. Experimental results show that the parallelization is promising for achieving higher coverage.

R4-10 (Time: 14:28 - 14:30)
 Title Quantitative Performance Comparison of Asynchronous and Synchronous Comparators Author *Kyota Akimoto, Toshiki Kanamoto, Atsushi Kurokawa, Masashi Imai (Hirosaki University, Japan) Page pp. 296 - 297 Keyword asynchronous circuit, comparator, bundled-data, average performance, hardware merge sorter Abstract Asynchronous circuits, which can achieve average-case performance thanks to request-and-acknowledge handshaking protocols, have great potential to improve speed compared to synchronous circuits. In this paper, we propose a performance-efficient circuit structure of a comparator for hardware merge sorters. We evaluate the proposed circuit and its synchronous counterpart using a 130 nm process technology. As a result, the proposed asynchronous comparator can achieve higher performance than the synchronous circuit, depending on the input data.

R4-11 (Time: 14:30 - 14:32)
 Title Wire Load Model for Rapid Power Consumption Evaluation in Early Design Stage of Via-Switch FPGA Author *Asuka Natsuhara, Takashi Imagawa, Hiroyuki Ochi (Ritsumeikan University, Japan) Page pp. 298 - 303 Keyword atom switch, reconfigurable architecture, power estimation Abstract This paper proposes a wire load model for via-switch FPGA to allow simulation-based power estimation before routing. Via-switch FPGA is expected to achieve a dramatic improvement in area, delay, and power compared with conventional SRAM-based FPGA. To estimate the power consumption of an application circuit mapped on a via-switch FPGA, a time-consuming routing process was previously needed before circuit simulation. Using the proposed post-placement simulation flow, the runtime for power estimation is reduced by 63.8% on average compared with the conventional post-routing simulation flow, with 11.8% degradation of estimation error on average.

R4-12 (Time: 14:32 - 14:34)
 Title Clock Tree Modification for Circuits with Programmable Delay Elements Author *Kota Muroi, Yukihide Kohira (The University of Aizu, Japan) Page pp. 304 - 309 Keyword post-silicon delay tuning, programmable delay element, clock tree modification Abstract In this paper, a clock tree modification method for circuits with programmable delay elements (PDEs) is proposed. Since existing methods design the clock tree without taking post-silicon delay tuning (PSDT) into consideration, the resulting clock tree may not be suitable for PSDT. Our proposed method modifies the clock tree to improve yield in PSDT. Moreover, we propose a design flow for circuits with PDEs so that the design time is shortened and the method can be applied to large circuits.

R4-13 (Time: 14:34 - 14:36)
 Title A Study on Updating Spins in Ising Model to Solve Combinatorial Optimization Problems Author *Yuki Naito, Kunihiro Fujiyoshi (Tokyo University of Agriculture and Technology, Japan) Page pp. 310 - 315 Keyword Ising model, Ising computer, combinatorial optimization problem, traveling salesman problem Abstract Ising model, which consists of spins and interactions of them, is a novel way to solve combinatorial optimization problems, for example, LSI layout problem. The problem is solved by updating the spins stochastically after being mapped to the model. Spins can be updated simultaneously on hardware. However, the problems aren’t solved fast since two spins with interaction should not be updated simultaneously. In this paper, we give a guideline of updating the spins simultaneously to execute high-speed search and confirm it through experiments.

R4-14 (Time: 14:36 - 14:38)
 Title A Fast Hotspot Detector Based on Local Features Using Concentric Circle Area Sampling Author *Hidekazu Takahashi, Shimpei Sato, Atsushi Takahashi (Department of Information and Communications Engineering, Tokyo Institute of Technology, Japan) Page pp. 316 - 321 Keyword Design for Manufacturability, Lithography Hotspot Detection, Machine Learning Abstract With the development of technology nodes, a defective circuit pattern has occurred on a chip. Areas, which may cause defects such as opens/shorts, are called hotspots. In this paper, we propose the hotspot detector based on the probability distribution of feature vectors. Experimental results show that our proposed method achieves 98% accuracy while False Positive Rate is less than 1%, and its computation is 8 times faster than conventional machine learning based methods on ICCAD2012 benchmark suite.

R4-15 (Time: 14:38 - 14:40)

R4-16 (Time: 14:40 - 14:42)
 Title A Tuning-Free Reservoir of MOSFET Crossbar Array for Inexpensive Hardware Realization of Echo State Network Author *Yuki Kume, Masayuki Hiromoto, Takashi Sato (Graduate School of Informatics, Kyoto University, Japan) Page pp. 324 - 329 Keyword recurrent neural network, echo state network, reservoir computing, hardware implementation, weight tuning Abstract Echo state network (ESN) is a class of recurrent neural network that drastically reduces training time by the use of a reservoir, a random and fixed network serving as the input and middle layers. In this paper, we propose a hardware implementation of ESN that uses an inexpensive MOSFET-based reservoir. As opposed to existing reservoirs that require post tuning of weights for stability improvement, our ESN requires no post parameter tuning. For that purpose, we extend the circular law of random matrices to sparse reservoirs so that a fixed feedback gain can be determined. In evaluations using the Mackey-Glass time-series dataset, the proposed ESN achieved stable and successful inference without post parameter tuning.

R4-17 (Time: 14:42 - 14:44)
 Title Estimation of NBTI-Induced Timing Degradation Considering Duty Ratio Author *Kunihiro Oshima, Song Bian, Takashi Sato (Graduate School of Informatics, Kyoto University, Japan) Page pp. 330 - 335 Keyword Negative bias temperature instability, timing degradation sensor, critical path, duty ratio Abstract We propose a novel estimation method for NBTI-induced timing degradation that takes the duty ratios of the input signals into account. In the proposed method, the signal propagation delay is evaluated with the proposed replica sensor circuit. With evaluations of the threshold voltage degradation model, delays of critical path candidates are estimated. The simulation results show that the proposed method can reduce the estimation error of critical path delay by 63% compared to the delay estimation without duty consideration.

R4-18 (Time: 14:44 - 14:46)
 Title Polygon Fracture Method Considering Maximum Shot Size for Variable Shaped-Beam Mask Writing Author Mitsuru Hasegawa, *Kunihiro Fujiyoshi (Tokyo University of Agriculture and Technology, Japan) Page pp. 336 - 340 Keyword polygon fracturing, EB writing, mask data, dynamic programming Abstract Since variable shaped-beam mask writing machines for LSI mask production expose a rectangular shaped beam, we need to partition rectilinear polygons in the layout into a minimum number of rectangles while considering the size limit. In this paper, we propose a new fracturing method for convex rectilinear polygons using dynamic programming, which first cuts each polygon by slice-lines through concave vertices. The proposed method can solve the problem in polynomial time. Computer experiments confirm the space and time complexity of the method.

Invited Talk III
Time: 15:40 - 16:30 Tuesday, October 22, 2019
Chair: Tsung-Yi Ho (National Tsing Hua University, Taiwan)

I3-1 (Time: 15:40 - 16:30)
 Title (Invited Talk) Design and Demonstration of Superconducting Single Flux Quantum Circuits Operating around 50 GHz Author *Akira Fujimaki (Nagoya University, Japan) Page pp. 341 - 342 Keyword Superconductor Abstract We have been developing very high-speed digital processors, including microprocessors, based on the superconducting single flux quantum (SFQ) circuit. So far, we have successfully executed programs stored in an embedded memory at 50 GHz in a bit-serial microprocessor and demonstrated an 8-bit-parallel arithmetic logic unit at 50 GHz. These SFQ circuits show extremely high energy efficiency and high performance compared to semiconductor circuits even if the cooling penalty for superconducting circuits is considered. The SFQ circuit is classified as pulse logic, in which the binary signals ‘1’ and ‘0’ are defined as the presence and absence of a signal impulse between two consecutive clock signals, respectively. Pulse logic is free from the charge-up and discharge processes of capacitors or inductors, which leads to high-speed operation and low power consumption. The impulses of the SFQ circuits, referred to as SFQ pulses, have a typical pulse width of 4 ps and a sub-mV pulse height. The pulses corresponding to signals and clocks can propagate along superconducting passive transmission lines (PTLs) of the strip line/micro strip line structures at the speed of light with very small distortion, while transmitters and receivers made up of a few Josephson junctions are needed. Special care is required when designing SFQ circuits, because all the logic gates need clock signals. The setup and hold times for logic gates are 4 ps at most, and the accumulated time jitter of signals traveling in very long transmission lines reaches 1 ps. This means that the effective time window of signals relative to the two consecutive clock signals becomes 10 ps for 50 GHz operation. Considering the parallel data lines, the timing of the signals, including clocks, arriving at the logic gates is required to be controlled on the order of picoseconds. We have been building a top-down design method for SFQ large-scale integrated circuits based on a cell library, in which the bias-voltage-dependent timing parameters such as delay, setup time, and hold time are registered for all the logic gates and interconnects.
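
As a rough check of the timing figures quoted in the abstract above, the 20 ps clock period follows directly from the 50 GHz operating frequency. The decomposition of the remaining budget is an assumption made here for illustration and is not stated by the author: it takes the setup time and the hold time as 4 ps each and counts the 1 ps jitter once for the signal and once for the clock, which is one reading that reproduces the quoted 10 ps window. The symbols $T_{\mathrm{clk}}$, $t_{\mathrm{setup}}$, $t_{\mathrm{hold}}$, $t_{\mathrm{jitter}}$, and $T_{\mathrm{window}}$ are likewise introduced here.

```latex
T_{\mathrm{clk}} = \frac{1}{50\ \mathrm{GHz}} = 20\ \mathrm{ps}
\qquad
T_{\mathrm{window}} \approx T_{\mathrm{clk}} - t_{\mathrm{setup}} - t_{\mathrm{hold}} - 2\,t_{\mathrm{jitter}}
= 20 - 4 - 4 - 2 = 10\ \mathrm{ps}
```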