The 22nd Workshop on Synthesis And System Integration of Mixed Information Technologies
Technical Program

Remark: The presenter of each paper is marked with "*".

Session Schedule

Monday, October 21, 2019

8:30 -
9:00 - 9:20
9:20 - 10:20  K1  Keynote Speech I
10:20 - 11:50  R1  Regular Poster Session I
11:50 - 13:20
13:20 - 14:10  I1  Invited Talk I
14:10 - 15:40  R2  Regular Poster Session II
15:40 - 17:10  D  Panel Discussion
18:00 - 20:00

Tuesday, October 22, 2019

9:20 - 10:20  K2  Keynote Speech II
10:20 - 11:50  R3  Regular Poster Session III
11:50 - 13:20
13:20 - 14:10  I2  Invited Talk II
14:10 - 15:40  R4  Regular Poster Session IV
15:40 - 16:30  I3  Invited Talk III
16:30 - 16:40

List of papers

Monday, October 21, 2019

Keynote Speech I
Time: 9:20 - 10:20 Monday, October 21, 2019
Chair: Tsung-Yi Ho (National Tsing Hua University, Taiwan)

K1-1 (Time: 9:20 - 10:20)
Title: (Keynote Speech) Microfluidics Meets Microbiology: The Journey of Digital Microfluidic Biochips from Laboratory Research to Commercialization and Beyond
Author: *Krishnendu Chakrabarty (Duke University, USA)
Page: p. 1
Abstract: Digital microfluidics was transitioned to the marketplace for sample preparation by Illumina a few years ago. Since then, this technology has also been deployed by Genmark for infectious disease testing and Baebies for the detection of lysosomal enzymes in newborns. This lecture will describe the journey from early laboratory research, PhD theses and publication of research articles, to technology transfer and licensing to companies. Despite these success stories, there still remains a significant gap between microfluidics research and its adoption in microbiology. The presenter will describe how this gap can potentially be closed through new directions in digital microfluidics, including recent advances in micro-electrode-dot arrays, acoustofluidics, and countermeasures against malicious attacks on biomolecular protocols.

Regular Poster Session I
Time: 10:20 - 11:50 Monday, October 21, 2019
Chairs: Masashi Imai (Hirosaki University, Japan), Rung-Bin Lin (Yuan Ze University, Taiwan)

Best Paper Award
R1-1 (Time: 10:20 - 10:22)
Title: Energy-efficient ECG Signals Outlier Detection Hardware using a Sparse Robust Deep Autoencoder
Author: *Naoto Soga, Shimpei Sato, Hiroki Nakahara (Tokyo Institute of Technology, Japan)
Page: pp. 2 - 7
Keyword: Outlier Detection, autoencoder, a sparse network, FPGA, ECG
Abstract: In recent years, portable electrocardiographs have begun to spread, which enable us to record electrocardiogram (ECG) signals in everyday life. A portable ECG analysis device is needed so that abnormal ECG waves can be detected anywhere. Machine learning techniques, including deep learning, are used in a lot of research to analyze ECG signals since they perform better than conventional methods. However, deep learning models often have too many parameters to implement on mobile hardware. In this research, we propose a method to implement an ECG outlier detector using deep learning techniques in a small built-in device. As a way of detecting outliers, an autoencoder, which is based on neural networks, was used. A sparseness technique was applied to the autoencoder, and the trained autoencoder was implemented on a low-end FPGA. Compared with an ARM Cortex-M3 embedded processor, the proposed hardware achieves 159 times better energy efficiency.

R1-2 (Time: 10:22 - 10:24)
Title: A Design Space Exploration Method of SoC Architecture for CNN-based AI Platform
Author: *Salita Sombatsiri (Osaka University, NEC Corporation, Japan), Jaehoon Yu, Masanori Hashimoto (Osaka University, Japan), Yoshinori Takeuchi (Kindai University, Japan)
Page: pp. 8 - 13
Keyword: Design space exploration, System-on-a-chip, CNN, multi-layer bus
Abstract: This paper proposes a design space exploration (DSE) method for a CNN-based AI platform to find SoC architectures that optimally parallelize massive data computation and data transfer. First, the proposed DSE explores both functional blocks, which undertake a process execution, and their parameters, i.e. the number of instances and PEs, to parallelize CNN's intensive intra-process computation with the ease of system modeling and exploration. Second, a multi-layer bus architecture and configuration are optimized to parallelize data transfer by performing master-slave clustering with three-step channel mapping. Experimental results show that the proposed DSE with a pruning technique found 17 Pareto-optimal architectures from the design space of 2 million architectures within 11.5 hours, a 21% time reduction compared to the exhaustive exploration.

R1-3 (Time: 10:24 - 10:26)
Title: Reconfigurable Activation Functions for Neural Networks Application
Author: Yu-Jung Huang (I-Shou University, Taiwan), Meng-Jhe Li, *Wun-Siou Jhong, Shao-I Chu (National Kaohsiung University of Science and Technology, Taiwan)
Page: pp. 14 - 17
Keyword: FPGA, activation function, neural networks
Abstract: Field programmable gate arrays (FPGAs) have recently become popular for accelerating deep learning networks due to their parallel processing and reconfigurable capabilities as well as their energy efficiency. This paper presents a multi-layer neural network architecture with novel reconfigurable activation functions by utilizing the coordinate rotation digital computer (CORDIC) technique and applying the floating-point format (IEEE 754 standard in single precision). The functionality was successfully verified in hardware using a DE2-115 board that included an Altera Cyclone® IV FPGA.
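
As a rough illustration of how CORDIC can evaluate an activation function, the following Python sketch computes tanh with hyperbolic CORDIC in rotation mode. It is not the paper's design: the choice of tanh, the function name `cordic_tanh`, the iteration count, and the use of Python floats instead of a hardware number format are all assumptions made for this example.

```python
import math


def cordic_tanh(z, iterations=16):
    """Approximate tanh(z) with hyperbolic CORDIC in rotation mode.

    Converges only for roughly |z| < 1.1; larger inputs need range reduction.
    """
    # Shift sequence 1, 2, 3, 4, 4, 5, ...: shifts 4 and 13 are repeated,
    # which hyperbolic CORDIC requires for convergence.
    shifts = []
    i = 1
    while len(shifts) < iterations:
        shifts.append(i)
        if i in (4, 13) and len(shifts) < iterations:
            shifts.append(i)
        i += 1

    x, y = 1.0, 0.0  # the CORDIC gain cancels in the ratio y / x
    for i in shifts:
        d = 1.0 if z >= 0 else -1.0
        step = 2.0 ** -i  # a right shift by i bits in hardware
        x, y = x + d * y * step, y + d * x * step
        z -= d * math.atanh(step)  # a small lookup table in hardware
    return y / x


if __name__ == "__main__":
    for value in (-1.0, 0.0, 0.5, 1.0):
        print(value, cordic_tanh(value), math.tanh(value))
```

Each iteration uses only additions, shifts, and one table lookup, which is why the technique suits FPGA logic. Switching the table and update signs yields other functions, which is one plausible reading of "reconfigurable" here.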

R1-4 (Time: 10:26 - 10:28)
Title: Minimization of Energy Consumption of Double Modular Redundancy Design of Conditional Processing by Common Condition Dependency
Author: *Kazuhito Ito (Saitama University, Japan)
Page: pp. 18 - 23
Keyword: Double modular redundancy, soft error, conditional processing, energy minimization
Abstract: Double modular redundancy (DMR) executes an operation twice and detects soft errors by comparing the operation results. The error is corrected by executing necessary operations again. The DMR design for conditional processing is considered in this work. A method is proposed which makes the secondary executions of the duplicated operations dependent on the primary execution of the condition operation, thereby widening the schedule solution space and allowing better results to be derived. The minimization of energy consumption with the proposed method is formulated as ILP models and the optimum solution is obtained by using an ILP solver.
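
The basic DMR behavior described in the first two sentences (execute twice, compare, re-execute on mismatch) can be sketched in software as follows. This covers only the generic detect-and-retry idea, not the paper's scheduling or ILP formulation; `run_with_dmr`, `FlakyAdder`, and the retry limit are hypothetical names and choices made for illustration.

```python
def run_with_dmr(operation, *args, max_retries=3):
    """Run an operation twice, compare the results, and re-execute on mismatch."""
    for _ in range(max_retries + 1):
        primary = operation(*args)
        secondary = operation(*args)
        if primary == secondary:
            return primary  # both executions agree: accept the result
    raise RuntimeError("results still disagree after retries")


class FlakyAdder:
    """Hypothetical adder whose first execution suffers a single bit flip."""

    def __init__(self):
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        result = a + b
        if self.calls == 1:
            result ^= 1 << 3  # emulate a soft error in bit 3
        return result


if __name__ == "__main__":
    adder = FlakyAdder()
    print(run_with_dmr(adder, 2, 3), "after", adder.calls, "executions")
```

The extra executions after a mismatch are what cost energy, which is why the order and dependency of primary and secondary executions matter for the optimization the abstract describes.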

R1-5 (Time: 10:28 - 10:30)
Title: Application of Overlap-Add FFT Algorithm for Computation Reduction of Convolution Neural Networks
Author: Hsia-Tsung Wang, *Wei-Kai Cheng (Chung Yuan Christian University, Taiwan)
Page: pp. 24 - 26
Keyword: CNN, FFT
Abstract: As the computation demand of CNNs is dominated by convolution layers, some studies exploit the duality between the spatial domain and the frequency domain through the fast Fourier transform (FFT) to replace convolutions with pointwise multiplications. However, the FFT approach requires zero padding to enlarge the filter kernel to the same size as the input feature map. In this paper, we apply the overlap-add FFT algorithm to resolve the large zero padding problem in the full FFT model. Our approach fits all filter kernel sizes and especially benefits small kernels such as 3x3. Experiments on ResNet-34 show that, on average, our overlap-add FFT scheme brings the complexity to about 41% of that of convolution, and this can be further reduced to 10% with circuit optimization.
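
To make the overlap-add idea concrete, here is a minimal one-dimensional Python sketch. The input is cut into short blocks, each block is convolved with the kernel through a small FFT, and the overlapping tails are summed. The paper targets two-dimensional CNN feature maps; the 1-D setting, the helper names, and the block length are assumptions for illustration only.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def overlap_add_convolve(signal, kernel, block_len=4):
    """Linear convolution of a long signal with a short kernel via overlap-add."""
    m = len(kernel)
    # FFT size: smallest power of two that holds one block's linear convolution.
    n_fft = 1
    while n_fft < block_len + m - 1:
        n_fft *= 2
    kernel_f = fft(list(kernel) + [0] * (n_fft - m))
    result = [0.0] * (len(signal) + m - 1)
    for start in range(0, len(signal), block_len):
        block = list(signal[start:start + block_len])
        block_f = fft(block + [0] * (n_fft - len(block)))
        product = [a * b for a, b in zip(block_f, kernel_f)]
        # The inverse FFT above is unscaled, so divide by n_fft here.
        segment = [v.real / n_fft for v in fft(product, inverse=True)]
        # Add this block's output, including the tail overlapping the next block.
        for i in range(len(block) + m - 1):
            result[start + i] += segment[i]
    return result


def direct_convolve(signal, kernel):
    """Reference implementation: direct linear convolution."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out
```

The kernel is padded only to the small FFT size, not to the full input size, which is the zero-padding saving the abstract refers to.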

R1-6 (Time: 10:30 - 10:32)
Title: Improving Global Motion Compensation for Frame Interpolation with High-Resolution and High-Frame-Rate Video
Author: *Keita Ukihashi, Takashi Imagawa (Ritsumeikan University, Japan), Hiroshi Tsutsui, Yoshikazu Miyanaga (Hokkaido University, Japan), Hiroyuki Ochi (Ritsumeikan University, Japan)
Page: pp. 27 - 32
Keyword: frame interpolation, motion compensation
Abstract: In this paper, we propose a novel global motion compensation method to be used in frame interpolation from input video that consists of high-resolution less-frequent frames (keyframes) and low-resolution high-frame-rate (LR-HF) frames. To generate better-interpolated background from two keyframes using homography transformation, we improve the accuracy of global motion estimation by eliminating and interpolating feature points (FPs) and by detecting erroneous homography matrices. We also introduce an adaptive weight model for superimposing transformed keyframes. The experimental results show that the proposed method achieves interpolated frames with better quality than the conventional one.

R1-7 (Time: 10:32 - 10:34)
Title: Configurable Processor Hardware Developing Environment for RISC-V with Vector Extension
Author: *Ryo Taketani (Department of Information Systems Engineering, Osaka University, Japan), Yoshinori Takeuchi (Department of Electric and Electronic Engineering, Kindai University, Japan)
Page: pp. 33 - 38
Keyword: Configurable processor, RISC-V, Vector architecture
Abstract: This research proposes a configurable processor hardware developing environment for RISC-V with vector extension. RISC-V is getting more attention as an open Instruction Set Architecture. RISC-V specifies a vector extension for parallel computing that takes power savings and execution cycle performance into consideration. We implemented a RISC-V based hardware processor with the vector extension and evaluated it.

R1-8 (Time: 10:34 - 10:36)
Title: Improved Multiplier Architecture on ASIC for RLWE-based Key Exchange
Author: *Tatsuki Ono, Song Bian, Takashi Sato (Graduate School of Informatics, Kyoto University, Japan)
Page: pp. 39 - 40
Keyword: ring learning with errors, application specific integrated circuit, cryptography, key exchange, multiplier
Abstract: The ring learning with errors (RLWE) problem is one of the most promising candidates for constructing quantum-resistant cryptosystems. In this work, we implement an improved hardware multiplier unit for RLWE key exchange schemes. By reducing internal processing units and shortening processing steps, circuit area, power, and latency are reduced to 0.63x, 0.48x, and 0.86x, respectively, compared to the conventional architecture.

R1-9 (Time: 10:36 - 10:38)
Title: Parameter Embedding for Efficient FPGA Implementation of Binarized Neural Networks
Author: *Reina Sugimoto, Nagisa Ishiura (Kwansei Gakuin University, Japan)
Page: pp. 41 - 45
Keyword: binarized neural network, FPGA implementation, parameter embedding
Abstract: A binarized neural network (BNN), a restricted type of neural network where weights and activations are binary, enables compact hardware implementation. While the existing architectures for BNN assume that weights and biases are stored in on-chip RAMs, this paper presents an attempt to embed those parameters into processing elements by utilizing LUTs in FPGAs as ROMs. This eliminates the bandwidth limitation between memories and neuron PEs and allows higher parallelism, and it also reduces the hardware cost of the neuron PEs. This paper also proposes a map-shift scheme to efficiently supply the neuron PEs with feature map data for convolution. As a case study, LeNet5 has been implemented based on this method targeting Xilinx FPGA Artix-7, which can process a frame in 1,386 cycles at 21.1MHz.
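
For readers unfamiliar with BNNs, the sketch below shows one common way a binarized neuron is evaluated: XNOR followed by a population count. The abstract does not state that the paper uses this formulation, so it is an assumption for illustration, as are the bit encoding (1 for +1, 0 for -1) and the name `binary_neuron`.

```python
def binary_neuron(activations, weights, bias, width):
    """Evaluate one binarized neuron.

    activations and weights are integers used as bit masks of the given width,
    where bit value 1 encodes +1 and bit value 0 encodes -1.
    Returns 1 if the biased dot product is non-negative, else 0.
    """
    mask = (1 << width) - 1
    agree = ~(activations ^ weights) & mask  # XNOR: 1 where the signs match
    popcount = bin(agree).count("1")
    dot = 2 * popcount - width  # sum of the +1/-1 products
    return 1 if dot + bias >= 0 else 0


if __name__ == "__main__":
    # Activations +1,-1,+1,+1 against weights +1,-1,-1,+1: dot product is 2.
    print(binary_neuron(0b1011, 0b1001, bias=0, width=4))
```

Because `weights` and `bias` are fixed after training, they can be treated as constants of the logic itself, which is the intuition behind embedding them in LUTs rather than reading them from RAM.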

R1-10 (Time: 10:38 - 10:40)
Title: A 4CH CNN Hardware Architecture for Image Super-Resolution
Author: *Koyo Suzuki, Kazuki Mori, Nobutaka Kuroki (Kobe University, Japan), Tetsuya Hirose (Osaka University, Japan), Masahiro Numa (Kobe University, Japan)
Page: pp. 46 - 50
Keyword: Super-Resolution, CNN
Abstract: This paper presents two hardware architectures for super-resolution technology with 4CH CNN (convolutional neural network with four output-channels). We introduce time-division processing to save resources. Moreover, we propose a technique to save resources by sharing some part of the circuit in one architecture. Experimental results have shown that the architecture reduces resources by about 4 to 21 pt. compared to the other architecture. Both architectures run about 5.5 times as fast as software processing.

R1-11 (Time: 10:40 - 10:42)
Title: Approximate Function Configuration by Neural Network on Memory-array Unit
Author: *Xuechen Zang, Shigetoshi Nakatake (The University of Kitakyushu, Japan), Hiroyuki Kozutsumi, Mitsunori Katsu (TRL Corp., Japan), Shoichi Sekiguchi (TAIYO YUDEN Co., LTD, Japan)
Page: pp. 51 - 55
Keyword: Approximate Computing, Reconfigurable Systems, MRLD, Approximate Logic
Abstract: This paper presents approximate computing consistent with a memory-based reconfigurable logic device (MRLD). We propose a novel implementation flow for realizing a function of a multiple look-up table (MLUT) by employing neural network (NN) based machine learning. Like function fitting, our method implements a logic function induced by a set of inputs and outputs. To verify the performance of the approximate computing implementation, we compare a general polynomial regression method and deep neural networks. The results suggest that a relatively deeper NN is superior in loss value and accuracy rate. The NN models achieve a lower symbol error rate (SER) and a considerable loss reduction compared to the polynomial regression. Besides, we demonstrate how to use such models for an 8-bit inverter logic example.

R1-12 (Time: 10:42 - 10:44)
Title: A Deep Neuro-Fuzzy for False Decision Prevention on an FPGA
Author: *Masayuki Shimoda, Hiroki Nakahara (Tokyo Institute of Technology, Japan)
Page: pp. 56 - 61
Keyword: Deep Neural Network, Fuzzy Inference, FPGA
Abstract: We propose a deep neuro-fuzzy that consists of a deep neural network (DNN) and fuzzy inference. The fuzzy inference judges whether inputs are distinguishable or not from the DNN outputs to avoid critical errors (e.g., recognizing malignant data as benign). When our system detects indistinguishable data, it outputs "indistinguishable". Experimental results show that the recall increased by 20.52% in the best case and its area and computation time are almost the same compared with typical DNNs. Thus, our proposal is more suitable for embedded systems in situations where the error is critical.

R1-13 (Time: 10:44 - 10:46)
Title: A Real Chip Evaluation of a CNN Accelerator SNACC
Author: *Ryohei Tomura, Takuya Kojima, Hideharu Amano (Dept. of Information and Computer Science, Keio University, Japan), Ryuichi Sakamoto, Masaki Kondo (Graduate School of Information Science and Technology, The University of Tokyo, Japan)
Page: pp. 62 - 67
Keyword: Accelerator, CNN
Abstract: SNACC (Scalable Neuro Accelerator Core with Cubic integration) is an accelerator for deep neural networks, which can improve the performance by increasing the number of stacked chips with inductive coupling wireless through chip interface (TCI). The chip implementation and real chip evaluation of SNACC are introduced. It consists of four processing element cores which execute dedicated SIMD instructions, distributed memory modules for storing weight data, and TCI. The real chip evaluation using Renesas Electronics' 65nm SOTB (Silicon On Thin BOX) CMOS technology shows that a simple CNN LeNet works at 50MHz for all layers with 0.90V supply voltage. The power consumption is less than 12mW. Forward body biasing can enhance the performance by about 15% in exchange for a leakage increase of about 2mW. Also, SNACC achieved more than 20 times the performance of a MIPS R3000 compatible embedded processor.

R1-14 (Time: 10:46 - 10:48)
Title: IMU-based Rehabilitation System for Upper and Lower Limbs
Author: Chun-Jui Chen, Yi-Ting Lin, Chia-Chun Lin (Department of Computer Science, National Tsing Hua University, Taiwan), Yung-Chih Chen (Department of Computer Science and Engineering, Yuan Ze University, Taiwan), *Chun-Yao Wang (Department of Computer Science, National Tsing Hua University, Taiwan)
Page: pp. 68 - 73
Keyword: Rehabilitation, knee angle, elbow angle
Abstract: In this work, we present an IMU-based rehabilitation system for upper and lower limbs. This system uses two wearable IMU sensors to detect rehabilitation motions of patients recovering from frozen shoulder and from knee and hip surgeries. The sensors are also connected to a smartphone via Bluetooth, and an Android app is designed to show the correctness and the statistics of the rehabilitation exercises. The experimental results show that the average errors of knee angle and elbow angle are both less than 5°. The average recognition rates of all rehabilitation exercises are larger than 85%.

R1-15 (Time: 10:48 - 10:50)
Title: A Smart Single-Sensor Device for Instantaneously Monitoring Lower Limb Exercises
Author: Yan-Ping Chang, Teng-Chia Wang, Chun-Jui Chen, Chia-Chun Lin (National Tsing Hua University, Taiwan), *Yung-Chih Chen (Yuan Ze University, Taiwan), Chun-Yao Wang (National Tsing Hua University, Taiwan)
Page: pp. 74 - 79
Keyword: stride count, walking distance, 9-axial sensor
Abstract: Studies have shown that stair exercises can enhance the strength of lower limbs for patients with limb disorders. However, there are only a few systems that can monitor lower limb exercises in medical institutes. To analyze lower limb exercises instantaneously, we propose a smart single-sensor wearable device, S3-Sock, mounted on shoes. The sock can monitor and measure the stride count, step height, and the distance of step trajectory of lower limb exercises. The experimental results demonstrate that the proposed system is reliable under different lower limb exercises. The averages of absolute mean errors of stride count in stair-climbing and walking are about 2.00% and 0.88%, respectively. The averages of absolute mean errors of step height are about 5.12% and 8.23% in step-by-step and step-over-step stair climbing, respectively.

R1-16 (Time: 10:50 - 10:52)
Title: 1-D GDR Aware Cell Generation via P/N bi-partition
Author: Yao-Lin Chang, Hung-Ming Chen, *Wei-Tung Chao, Chien-Hung Lin (National Chiao Tung University, Taiwan)
Page: pp. 80 - 81
Keyword: layout, standard cell
Abstract: As the complexity of a layout design grows, the layout generation problem has become more challenging. This work features the bi-partition tree and the selective stage. With this bi-partition tree, we speed up the layout generation flow and guarantee no additional wire length. With objective functions in the placement selection stage and the routing stage, a lithography-friendly layout with low congestion, minimum area and high performance is accomplished.

Invited Talk I
Time: 13:20 - 14:10 Monday, October 21, 2019
Chair: Shigeru Yamashita (Ritsumeikan University, Japan)

I1-1 (Time: 13:20 - 14:10)
Title: (Invited Talk) LSI Design and Current Topics for Automotives
Author: *Toshihiro Hattori (Renesas Electronics, Japan)
Page: p. 82
Abstract: Automotive is one of the major applications for semiconductor devices, and semiconductor devices are key to supporting the current innovation of MOBILITY (automotive) systems. First, I will explain the different needs, features, and technologies for automotive-oriented LSIs. Automotive technology is undergoing drastic innovation led by the keywords “CASE (Connected, Autonomous, Shared & Services, Electric)” and “MaaS (Mobility as a Service)”. I will give an overview of the trends and needs for automotive LSIs. Functional safety and security are the key technologies required of current automotive LSIs. I will explain the trends and background of autonomous driving and show an example of the latest implementation of LSIs supporting autonomous driving. I will show the background of the functional safety trends and an example of a 28nm automotive flash microcontroller for next-generation automotive architecture complying with ISO26262 ASIL-D. I will show the background of the security trends in automotive and an example of a 24MB embedded flash system based on 28nm SG-MONOS featuring robust over-the-air software update.

Regular Poster Session II
Time: 14:10 - 15:40 Monday, October 21, 2019
Chairs: Yu-Guang Chen (National Central University, Taiwan), Ching-Hwa Cheng (Feng Chia University, Taiwan)

Outstanding Paper Awards
R2-1 (Time: 14:10 - 14:12)
Title: Insertion Based Procedural Construction of Parallel Prefix Adders
Author: *Bo-Yu Tseng, Mineo Kaneko (Japan Advanced Institute of Science and Technology, Japan)
Page: pp. 83 - 88
Keyword: adder, optimization, binary tree
Abstract: As a novel approach to the design of parallel prefix adders, the framework of the procedural construction of parallel prefix adders has been proposed. This approach aims to configure the prefix tree structure by a sequence of basic structural operations. Among several basic operations, "insertion" has the potential to produce a variety of prefix structures while keeping the hardware cost low. This paper explores the essential structural variations achieved by the insertion operation, and proposes a coding scheme which can represent all these essential variations while excluding redundancy as much as possible. In our approach, we focus on the sequence of insertion operations applied at various positions, and propose to use a binary tree to specify the order of applying insertion operations. Our discussions in this paper would be an important base for the optimization of parallel prefix adders, which is part of our future work.

R2-2 (Time: 14:12 - 14:14)
Title: 3D Test Wrapper Chain Synthesis for Test Time and TSV Count Co-optimization under Constraints on I/O Cells
Author: Fan-Hsuan Tang, Hsu-Yu Kao, *Shih-Hsu Huang (Chung Yuan Christian University, Taiwan)
Page: pp. 89 - 94
Keyword: SoC Testing, Test Wrapper Chain Synthesis, Design for Testability, TSV Count Minimization, 3D ICs
Abstract: In addition to test time minimization, the number of testing TSVs is also an important concern for the 3D test wrapper chain synthesis problem. Previous co-optimization algorithms work only when there are no constraints on I/O cells. In this paper, we propose a single-stage KL (Kernighan-Lin) based algorithm to overcome this drawback. Different from previous works, the proposed synthesis algorithm can take specified I/O cell constraints into account during co-optimization. Benchmark data consistently show that the proposed algorithm can greatly reduce both test time and TSV number.

R2-3 (Time: 14:14 - 14:16)
Title: A New Approach to Express Stochastic Numbers
Author: *Yukino Watanabe, Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan)
Page: pp. 95 - 98
Keyword: Stochastic Computing, Stochastic Numbers
Abstract: Stochastic Computing (SC) is a technique to calculate complex functions with very small hardware overhead when we can allow some small errors. SC uses Stochastic Numbers (SNs), which are generally long bit strings (e.g., 1024 bits); we need many cycles to calculate a function with SNs. In this paper, we propose a novel idea to reduce the length of SNs while the precision level of SNs is not changed. Our idea is to express one SN by using two bit-strings, and the two bit-strings have different weights. The multiplication of two SNs in our expression is not trivial, so we propose a way to multiply two SNs in our new expression. Then we show some experimental results to confirm that our proposed multiplication can provide almost the same error rate as the conventional SNs with significantly shorter bit lengths.
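
As background, the conventional SC multiplication that the paper takes as its baseline can be sketched as follows: a value in [0, 1] is encoded as a bit string whose fraction of 1s approximates it, and two independent strings are multiplied with a bitwise AND. This sketch shows only that conventional unipolar scheme. The paper's two-bit-string weighted expression is not reproduced here, and all function names and the seed are assumptions for illustration.

```python
import random


def to_stochastic(value, length, rng):
    """Encode a value in [0, 1] as a bit list whose fraction of 1s approximates it."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def from_stochastic(bits):
    """Decode a stochastic number as the fraction of 1s in the bit list."""
    return sum(bits) / len(bits)


def sc_multiply(bits_a, bits_b):
    """Unipolar SC multiplication: a bitwise AND of two independent streams."""
    return [a & b for a, b in zip(bits_a, bits_b)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    a = to_stochastic(0.5, 1024, rng)
    b = to_stochastic(0.3, 1024, rng)
    print("approximate product:", from_stochastic(sc_multiply(a, b)))
```

The estimate's error shrinks only with the square root of the string length, which is why conventional SNs tend to be long and why a shorter representation with the same precision is attractive.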

R2-4 (Time: 14:16 - 14:18)
Title: Rapid Single-Flux-Quantum Matrix Multiplication Circuit Utilizing Bit-Level Processing
Author: *Nobutaka Kito, Takuya Kumagai (Chukyo University, Japan), Kazuyoshi Takagi (Mie University, Japan)
Page: pp. 99 - 103
Keyword: matrix multiplication, RSFQ circuits
Abstract: A rapid single-flux-quantum (RSFQ) matrix multiplication circuit utilizing bit-level processing is presented. The proposed circuit exploits the characteristics of pulse logic used in RSFQ circuits together with bit-level processing. The circuit carries out multiplications and additions by counting pulses on signal lines. It uses fewer gates than previously proposed parallel processing designs and could be realized in a small layout area. A layout for 4-bit 4 x 4 matrix multiplication was designed and its correct operation was verified in simulation.

R2-5 (Time: 14:18 - 14:20)
Title: Irregular Bumps Design Planning for Modern Ball Grid Array Packages
Author: Hsin-Yu Chang, Jyun-Ru Jiang, Simon Chen, Hung-Ming Chen, *Ya-Ying Chien (National Chiao Tung University, Taiwan)
Page: pp. 104 - 109
Keyword: flip-chip packages, routability
Abstract: In modern flip-chip packages, bumps are often placed irregularly due to different design needs. It costs a great amount of time and manual effort to generate substrate routing from bumps through vias to package balls. Moreover, no single model in prior works could be applied simultaneously between bumps, vias and balls. In this work, we propose a hybrid flow network model to formulate the 2-layer substrate routing problem on irregular package structure. We present a new bump model that can handle irregular bump plans. With our methodology, signal assignment on vias and balls, and substrate routing on two layers can be obtained at the same time. We also present an iterative optimization technique to improve wire congestion. Our results show that the proposed method completes via and ball assignment efficiently, and obtains 100% routability and an average wirelength improvement of 16.45%, compared with manual design in real industrial cases.

R2-6 (Time: 14:20 - 14:22)
Title: Droplet Splitting Routing for Micro-Electrode-Dot-Array Digital Microfluidic Biochips
Author: *Ikuru Yoshida, Kota Asai (Graduate School of Science and Engineering, Ritsumeikan University, Japan), Tsung-Yi Ho (National Tsing Hua University, Taiwan), Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan)
Page: pp. 110 - 115
Keyword: biochips, droplet routing, micro-electrode dot array
Abstract: Digital microfluidic biochips (DMFBs) are one of the most promising technologies to use for sample preparation. Among them, DMFBs based on a micro-electrode dot array (MEDA) are a technology that overcomes the drawbacks of a conventional DMFB. On MEDA based biochips, we can perform droplet shaping and splitting operations that cannot be performed on a conventional DMFB. In this paper, we propose an efficient droplet routing method by splitting droplets in MEDA when there are multiple spaces between block regions. We confirm by our experiment that our method indeed can reduce the necessary time steps for droplets to reach target regions.

R2-7 (Time: 14:22 - 14:24)
Title: Exploring Time-space Trade-off for Application Mapping onto 3-D Torus NoCs
Author: *Yao Hu, Michihiro Koibuchi (National Institute of Informatics, Japan)
Page: pp. 116 - 117
Keyword: Network-on-Chip (NoC), topology embedding, interconnection network, job mapping
Abstract: One application usually has many parallel tasks running on multiple processing cores which communicate with each other on a many-core chip. Traditionally, the tasks are mapped onto a regular topology of network-on-chip (NoC) with nearby processing cores to reduce the network distances. In this case, fragmentation of unused processing cores may occur when receiving a new incoming application on a chip. In this study, we assume that each application has to be executed on a pre-fixed network topology on a many-core chip with 3-D torus NoC. To improve the system utilization, i.e. to reduce the number of unused processing cores, we allow an application to be mapped onto non-adjacent processing cores that form the pre-fixed network topology. We evaluate the time-space trade-off during node allocation with different mapping dilations for the purpose of improving job scheduling abilities. Evaluation results show that, for a large compound workload of NAS Parallel Benchmarks (NPB) applications, the proposed mapping can reduce turnaround time by up to 6% when compared with the regular topology mapping on a large 3-D torus NoC.

R2-8 (Time: 14:24 - 14:26)
Title: On Power Supply Pads Planning for Wire-bonded IC
Author: Hui Zhong Leong, *Ming-Yu Huang, Hung-Ming Chen (NCTU Taiwan, Taiwan), Chang-Tzu Lin (ITRI Taiwan, Taiwan)
Page: pp. 118 - 121
Keyword: power supply, pdn, wire-bonded ic
Abstract: In wire-bonding technology, Input/Output (I/O) pads are located along the periphery of the integrated circuit (IC) and power pad placement is limited by available I/O pad candidates. Power pads supply voltage to the IC through the power delivery network (PDN), hence insufficient power pads may cause IC failure. To overcome this problem, we propose a power pad placement algorithm for wire-bonding technology. Experimental results show that the proposed algorithm determines both power pad counts and power pad locations effectively for a given power delivery network. In addition, the worst voltage drop for the IC is guaranteed to be less than 3% of the supply voltage.

R2-9 (Time: 14:26 - 14:28)
Title: Sample Preparation with Efficient Dilution of Biochemical Fluids using Programmable Microfluidic Devices
Author: *Ying Shuaijie (Graduate School of Science and Engineering, Ritsumeikan University, Japan), Sudip Roy (Indian Institute of Technology (IIT) Roorkee, India), Juinn-Dar Huang (National Chiao Tung University, Taiwan), Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan)
Page: pp. 122 - 125
Keyword: PMD, Sample preparation, two steps, small area
Abstract: Sample preparation, which is a front-end process to produce the desired target concentrations of the input reagent fluid, plays a pivotal role in every bioassay or biochemical laboratory protocol. In this paper, we propose two sample preparation algorithms for efficient dilution of biochemical fluids using programmable microfluidic devices (PMDs). The first method is called the dilution algorithm in two steps (DATS), which needs only two diluting operations. The other method is called the dilution algorithm in a small dilution area (DASDA), which needs less area than DATS.

R2-10 (Time: 14:28 - 14:30)
Title: An Efficient Character Generation Algorithm for High-Throughput E-Beam Lithography
Author: *Shih-Ting Lin, Hong-Yan Su (National Chiao Tung University, Taiwan), Oscar Chen (AnaGlobe Technology, Inc, Taiwan), Yih-Lang Li (National Chiao Tung University, Taiwan)
Page: pp. 126 - 131
Keyword: Character projection E-beam lithography, exact pattern matching, frequently used character, multi-intersection-level layout
Abstract: E-beam lithography has been one of the promising next-generation lithography technologies for 7nm and below technology nodes. Among various electron-beam lithography features, character projection (CP) attracts users because complex patterns can be printed in one e-beam shot. However, we still face severe challenges of generating characters on interconnection layers due to their pattern diversity. In this paper, we propose a multi-intersection-level (MIL) layout that can efficiently capture the relationships between nearby objects including the spacing between them. The inflated layer reduces the problem instance size for identifying the frequently used patterns while the intersection layers help in clipping windows to obtain an ideal character set. Experimental results show that the proposed methodology can efficiently yield the frequently used character set with up to 93.3% and 81.23% covering rate in the via layer and metal layer, respectively. Besides, for a panel layout, a set of frequently used characters to reach 100% covering rate is successfully identified.

R2-11 (Time: 14:30 - 14:32)
TitleColor Balancing-aware Non-Stitch Routing for Multiple Patterning Lithography
Author*Jia-Hong Chang, Shao-Yun Fang (National Taiwan University of Science and Technology, Taiwan)
Pagepp. 132 - 135
KeywordMultiple Patterning Lithography, Color Balancing, Routing
AbstractMultiple Patterning Lithography (MPL) is one of the major resolution enhancement technologies for sub-20 nm nodes, which requires decomposing a layout into multiple masks while considering the minimum mask spacing rule. In this paper, we propose an MPL-aware routing algorithm considering mask usage balancing to optimize pattern printability. Different from previous works, stitch insertion is not considered in our router since stitches are usually forbidden in industry to guarantee sufficient yield. To maximize the flexibility in mask usage optimization that is deficient for non-stitch routing, a multiple-objective minimum spanning tree algorithm (MO-MST) is proposed to make the distribution of generated wire segments more scattered. An integer linear programming (ILP)-based color refinement approach is also proposed to optimize mask usage balancing. Experimental results show that the proposed algorithm flow can generate MPL-compliant routing solutions with excellent mask usage balancing for the benchmarks released by the 2018 CAD Contest at ICCAD.

R2-12 (Time: 14:32 - 14:34)
TitleAn Efficient and Effective Macro Placement Algorithm for Large-Scale Mixed-Size Designs
AuthorJai-Ming Lin, You-Lun Deng, Ya-Chu Yang, *Jia-Jian Chen (Department of Electrical Engineering, National Cheng Kung University, Taiwan)
Pagepp. 136 - 137
Keywordmacro placement, simulated evolution, physical design, design hierarchy, mixed-size
AbstractWe propose a novel approach which integrates the simulated evolution algorithm and the corner stitching data structure. Unlike the simulated annealing algorithm which existing works adopt, our approach prevents a solution from getting stuck at a local optimum while taking less runtime. Even when a chip contains several preplaced macros that may not abut the chip boundaries, our approach is able to handle these situations. Experimental results show that our approach obtains better results in wirelength, routability, and runtime.

R2-13 (Time: 14:34 - 14:36)
TitleThermal Modeling and Simulation of a Smart Wrist-worn Wearable Device
Author*Kodai Matsuhashi (Hirosaki University, Japan), Koutaro Hachiya (Teikyo Heisei University, Japan), Toshiki Kanamoto, Masasi Imai, Atsushi Kurokawa (Hirosaki University, Japan)
Pagepp. 138 - 143
Keywordwearable device, thermal design, smart watch
AbstractWe propose a thermal-circuit model that can calculate temperatures in important places for thermal designs of smart wrist-worn wearable devices. The thermal model can be applied to various wrist-worn wearable devices, which consist of different device-body shapes, belt sizes, and materials. The temperatures obtained using the proposed model agree well with those obtained by a commercial thermal solver. Moreover, by simulations applying the model, we present important knowledge for thermal designs of wrist-worn wearable devices.

R2-14 (Time: 14:36 - 14:38)
TitleMixing of Biochemical Fluids using Programmable Microfluidic Devices
Author*Yuto Umeda (Graduate School of Science and Engineering, Ritsumeikan University, Japan), Sudip Roy (Indian Institute of Technology (IIT) Roorkee, India), Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan)
Pagepp. 144 - 149
Keywordprogrammable microfluidic device, the number of mixing operations, assigning reagents
AbstractA programmable microfluidic device (PMD) can mix the reagents in various ratios. In this paper, we propose a mixing method to reduce the number of mixing operations on PMDs. Our method finds the best assignment of each reagent to each mixing operation so that we can reduce the number of mixing operations by simplifying the ratio of reagents and reusing intermediate waste reagent. Experimental results show that our proposed method can make mixing trees with the smallest number of mixing operations.

R2-15 (Time: 14:38 - 14:40)
TitleGeneralized Via Pattern Awareness Substrate Routing Framework for Fine Pitch Ball Grid Array
AuthorJun-Sheng Wu, Chi-An Pan, *Yi-Yu Liu (National Taiwan University of Science and Technology, Taiwan)
Pagepp. 150 - 151
KeywordRouting, ILP
AbstractPackaging substrate has become one of the most important carriers to enable system-level and heterogeneous design within a small footprint size. Instead of applying advanced semiconductor interposer process technologies, the fine pitch ball grid array (FBGA) package substrates are manufactured by mechanical processes. To tackle stringent design rules owing to the mismatched via dimension and miscellaneous routing obstacles, substrate interconnect designs are usually customized by experienced substrate layout engineers. However, fully net-by-net manual design for hundred-scale FBGA is time consuming and error-prone. In this paper, we model the FBGA substrate routing as an integer linear programming (ILP) problem taking various via patterns and design-dependent constraints into account. Two-stage early exit methodology and ILP constraint reduction techniques are developed to boost the runtime of ILP solver. Experimental results indicate the potential of the proposed framework. We argue that complex FBGA designs could be semi-automated by using via pattern candidates to reduce the substrate layout design cycle time.

R2-16 (Time: 14:40 - 14:42)
TitleAcceleration of Radix-Heap based Dijkstra algorithm by Lazy Update
AuthorTomohiro Takahashi (University of Kitakyushu, Japan), *Yasuhiro Takashima (University of Kitakyushu, Japan)
Pagepp. 152 - 157
KeywordDijkstra's algorithm, Lazy update, Radix-heap
AbstractThis paper proposes a fast radix-heap-based Dijkstra algorithm with lazy update for solving the single source shortest path problem (SSSP). The conventional Dijkstra algorithm chooses one vertex with the minimum tentative distance among the unvisited vertices. A relaxation of this selection, called lazy update, has been proposed: it selects not just one but multiple vertices while still guaranteeing optimality. In this paper, we apply this lazy update method to the radix-heap based Dijkstra algorithm, which solves SSSP with integer edge distances. The experimental results confirm the efficiency of the proposed method, which executes 50% faster than the conventional Dijkstra algorithm.
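
As a rough illustration of the baseline named in this abstract, the following is a minimal sketch of Dijkstra's algorithm on a radix heap for non-negative integer edge weights. It does not reproduce the paper's lazy update; the class and function names (`RadixHeap`, `dijkstra`) and the adjacency-dictionary graph format are assumptions made for this example only.

```python
class RadixHeap:
    """Monotone priority queue for non-negative integer keys (illustrative)."""

    def __init__(self, bits=64):
        # Bucket i holds keys whose highest bit differing from `last` is bit i-1;
        # bucket 0 holds keys equal to `last`.
        self.buckets = [[] for _ in range(bits + 1)]
        self.last = 0   # key of the most recently extracted minimum
        self.size = 0

    def _index(self, key):
        return (key ^ self.last).bit_length()

    def push(self, key, value):
        assert key >= self.last, "radix heap requires monotone keys"
        self.buckets[self._index(key)].append((key, value))
        self.size += 1

    def pop(self):
        if not self.buckets[0]:
            # Take the first non-empty bucket and redistribute it around its minimum.
            i = next(i for i, b in enumerate(self.buckets) if b)
            bucket, self.buckets[i] = self.buckets[i], []
            self.last = min(key for key, _ in bucket)
            for key, value in bucket:
                self.buckets[self._index(key)].append((key, value))
        self.size -= 1
        return self.buckets[0].pop()


def dijkstra(graph, source):
    """graph: dict mapping vertex -> list of (neighbor, non-negative int weight)."""
    dist = {source: 0}
    heap = RadixHeap()
    heap.push(0, source)
    while heap.size:
        d, u = heap.pop()
        if d > dist[u]:
            continue  # stale entry: a shorter distance was already found
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heap.push(nd, v)
    return dist


example_graph = {
    "s": [("a", 4), ("b", 1)],
    "b": [("a", 2), ("c", 5)],
    "a": [("c", 1)],
    "c": [],
}
example_dist = dijkstra(example_graph, "s")
```

The radix heap relies on the fact that Dijkstra's algorithm never inserts a key smaller than the last extracted minimum, which is why integer edge distances are required.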

R2-17 (Time: 14:42 - 14:44)
TitleA Global Placement Method for RECON Spare Cells in ECO-Friendly Design Style
Author*Junpei Akashi, Suguru Hojo, Nobutaka Kuroki (Kobe University, Japan), Tetsuya Hirose (Osaka University, Japan), Masahiro Numa (Kobe University, Japan)
Pagepp. 158 - 163
KeywordECO, reconfigurable cell, error diagnosis, technology remapping
AbstractThis paper presents an approach to obtain a suitable global placement of RECON spare cells in the ECO (Engineering Change Order)-friendly design style, based on per-subregion statistics concerning critical and near-critical paths, occupancy of RECON embedded cells, and utilization of RECON cells. Experimental results have shown that the proposed method is effective in fixing post-mask ECOs while suppressing the increase in the maximum delay time compared with the conventional approach.

R2-18 (Time: 14:44 - 14:46)
TitleAn Efficient Thermal Model of Thin Film NiCr Resistors Considering Pulse Response
Author*Ryosuke Watanabe (Hirosaki University, Japan), Keita Izawa (Nikkohm Co., Ltd., Japan), Shota Kajiya, Daiki Tsunemoto, Koki Kasai, Atsushi Kurokawa, Toshiki Kanamoto (Hirosaki University, Japan)
Pagepp. 164 - 167
KeywordThin film resistors, Thermal circuit analysis
AbstractThis paper proposes an efficient thermal model of industrial thin film NiCr resistors. We considered the thermal destruction effect of the thin film NiCr resistors under high pulsed power incident conditions. The thin film NiCr resistors considered in this study have two types of thermal time constant. TCAD calculation indicates that a short thermal time constant of around 55 μs exists in the resistors, and experimental results indicate that a long thermal time constant of around 40 seconds exists. Therefore, to analyze the thermal transient behaviors of the resistors more precisely, we propose a thermal circuit model that includes both the short and long thermal time constants. In the model, the thermal resistance and heat capacitance of the thin NiCr sheet are precisely considered, and these parameters are quite important for the existence of the short thermal time constant. The existence of the short thermal time constant in this model is strongly related to the peak temperature of the considered resistors, and we think that the short time thermal response of the thin film NiCr resistors is related to the pulse durability of the resistors.

R2-19 (Time: 14:46 - 14:48)
TitleA Smart Knee Pad for Stride Count and Walking Distance Measurement via Knee Angle Calculation
AuthorTeng-Chia Wang, Yan-Ping Chang, Chun-Jui Chen, *Chia-Chun Lin (National Tsing Hua University, Taiwan), Yung-Chih Chen (Yuan Ze University, Taiwan), Chun-Yao Wang (National Tsing Hua University, Taiwan)
Pagepp. 168 - 173
Keywordknee angle, stride count, walking distance, 9-axial sensor
AbstractTo calculate the knee angle, stride counts, and walking distance, we propose a system, iKneePad, fusing two 9-axis sensors with Bluetooth equipped on the thigh and shank segments. The changing rates of hip and knee angles are used to determine the beginning and the ending of a stride. The thigh length, shank length, hip angle, and knee angle are used to calculate the walking distance. The experimental results show that the accuracy of stride count is 100%, the absolute mean errors of knee angle are 2.99 and 1.42 for the maximum and minimum flexion angles, respectively. For walking distance, the mean error rates are -2.40% and -2.26% for short (10m) and long (33m) distances, respectively. The proposed system also instantly provides feedback to users by showing on an Android smartphone when conducting rehabilitation or exercise with iKneePad.

[To Session Table]

Panel Discussion
Time: 15:40 - 17:10 Monday, October 21, 2019
Moderator: Hung-Ming Chen (National Chiao Tung University, Taiwan)

D-1 (Time: 15:40 - 17:10)
Title(Panel Discussion) Quo Vadis, EDA?
AuthorModerator: Hung-Ming Chen (National Chiao Tung University, Taiwan), Panelists: Krishnendu Chakrabarty (Duke University, USA), Ulf Schlichtmann (Technische Universität München, Germany), Toshihiro Hattori (Renesas Electronics, Japan), Pai H. Chou (National Tsing Hua University, Taiwan), Akira Fujimaki (Nagoya University, Japan), Donald Lie (Texas Tech University, USA), Organizer: Tsung-Yi Ho (National Tsing Hua University, Taiwan)
Pagep. 174
AbstractNowadays electronics and biomedical designs/applications have been facing critical moments, including the end/extension of Moore's law, killer applications and sustainability issues, etc. How to leverage all possible solutions in design and tools development including the employment of AI is thus essential. In this panel, we have six international researchers leading the discussion in the fields of biomedical, optical designs, automotives, 5G/IoTs, and quantum computing, figuring out how EDA can help shape the future designs.

Tuesday, October 22, 2019

[To Session Table]

Keynote Speech II
Time: 9:20 - 10:20 Tuesday, October 22, 2019
Chair: Tsung-Yi Ho (National Tsing Hua University, Taiwan)

K2-1 (Time: 9:20 - 10:20)
Title(Keynote Speech) EDA for Optical Networks-on-Chip (ONoCs): Achievements and Future Opportunities
Author*Ulf Schlichtmann (Technische Universität München, Germany)
Pagep. 175
KeywordOptical NoCs
AbstractOptical Networks on Chip (ONoCs) are a promising technology to resolve some issues which are increasingly plaguing traditional electrical NoCs. Excessive power consumption is chief among these issues. As researchers started looking into architectural options for ONoCs, it soon became apparent that Electronic Design Automation (EDA) would be very beneficial to improve such architectures and especially their physical implementation, e.g. due to the complexity involved. This is true already on a netlist level, but even more so once physical design is considered. Thus, for about 10 years, researchers have been working on EDA approaches for the design of ONoCs. I will review some achievements of EDA for ONoCs, with a focus on physical design (placement, routing). I will discuss current challenges in further improving EDA results. This will be followed by a look at opportunities for EDA research to further improve ONoC architectures. Opportunities exist especially in simultaneously considering multiple design aspects. The emphasis in this talk will be on Wavelength-Routed ONoCs (WRONoCs).

[To Session Table]

Regular Poster Session III
Time: 10:20 - 11:50 Tuesday, October 22, 2019
Chairs: Yukihide Kohira (University of Aizu, Japan), Yasuhiro Takashima (University of Kitakyushu, Japan)

Outstanding Paper Awards
R3-1 (Time: 10:20 - 10:22)
TitleEfficiency Investigation of Capacitors Mounted on Re-distribution Layers for FOWLP
Author*Koki Kasai, Atsushi Kurokawa, Masashi Imai, Toshiki Kanamoto (Hirosaki University, Japan)
Pagepp. 176 - 179
KeywordPDN, Impedance, Capacitance, FOWLP
AbstractThis paper provides insights on effective usage of an emerging decoupling capacitor. Power supply noise is one of the most serious concerns in the modern low voltage integrated circuits. Decoupling capacitors embedded in the re-distribution layers (RDL) are potentially effective to reduce the noise caused by the internal switching. However, the effectiveness of them is easily lost due to the equivalent series inductance and resistance. Here, we construct a post-layout simulation test bench to discuss the effectiveness by evaluating impedance profile as well as transient noise waveform. The experimental results show that the horizontal proximity of the RDL embedded capacitors to the noise source is an important factor to keep the advantage.

R3-2 (Time: 10:22 - 10:24)
TitleUnbalanced Splitting Tolerant Sample Preparation Algorithm for Digital Microfluidic Biochips
AuthorLing-Yen Song, Yi-Ling Chen, Yung-Chun Lei, *Juinn-Dar Huang (Institute of Electronics, National Chiao Tung University, Taiwan)
Pagepp. 180 - 183
Keyworddigital microfluidic biochip, sample preparation, unbalanced splitting, probability-based forecast, forecast-based correction
AbstractSample preparation is regarded as one of the necessary processing steps in most biochemical assays. In the past decade, several techniques have been presented to deal with sample preparation issues under the (1:1) mixing model on digital microfluidic biochips (DMFBs). Most previous works assumed that mixing-then-splitting would produce two identical output droplets. However, due to uncontrollable variabilities, previous works may fail to provide exact solutions in the presence of unbalanced splitting. In this paper, we propose a new forecast-based correction algorithm for the unbalanced splitting problem. Our new algorithm not only guarantees a correct solution, but also requires neither extra reactants nor on-chip specialized hardware. Experimental results show that the effect of unbalanced splitting can be eliminated at the cost of only 20% more operation steps. Therefore, the proposed algorithm is both reliable and efficient.

R3-3 (Time: 10:24 - 10:26)
TitleKR-CHIP: An Educational Computer equipped with 8-bit Accumulator-based, 16-bit Accumulator-based and 32-bit Pipeline Processors
AuthorHiroyuki Kanbara (ASTEM RI, Japan), Kagumi Azuma, Yuuki Oosako (Kwansei Gakuin University, Japan), Atsuya Shibata (Nara Institute of Science and Technology, Japan), *Wakako Nakano (Kwansei Gakuin University, Japan)
Pagepp. 184 - 189
KeywordEducation, CPU, FPGA, Accumulator-based, Pipeline
AbstractThis article presents a processor for computer education named KR-CHIP. KR-CHIP integrates 3 CPUs: 8-bit accumulator-based, 16-bit accumulator-based and 32-bit pipeline architecture. Every register, counter, flag and memory can be observed directly by hardware at any clock cycle or at any phase of instruction execution. KR-CHIP is useful for beginners of computer hardware to understand how instructions are processed inside a CPU.

R3-4 (Time: 10:26 - 10:28)
TitleA Trial of Electric Chemical Degradation Process Simulation for Lead-acid Batteries
Author*Daiki Imai, Masahiro Fukui (Ritsumeikan University, Japan), Keiichi Hasegawa (Plan Be, Japan)
Pagepp. 190 - 191
KeywordBattery Management, Simulation, Optimization, Lead-acid Battery
AbstractA trial of computer simulation for degradation of lead-acid battery is examined by the concepts of reaction distance. The recovery rate depends on the time of charge after discharge, the reaction distance, and the particle diameter of PbSO4 salts.

R3-5 (Time: 10:28 - 10:30)
TitleRegister Minimization in Double Modular Redundancy Design with Soft Error Correction by Replay
Author*Yuya Kitazawa (Saitama University, Japan), Shinichi Nishizawa (Fukuoka University, Japan), Kazuhito Ito (Saitama University, Japan)
Pagepp. 192 - 197
KeywordDouble modular redundancy, soft error, register minimization
AbstractDouble modular redundancy (DMR) is to execute an operation twice and detect soft error by comparing the duplicated operation results. The soft error is corrected by executing necessary operations again, called replay. The replay requires error-free input data and registers are needed to store such necessary error-free data. In this paper, a method to minimize the required number of registers is proposed where replay intervals are appropriately selected so as not to increase the register requirement. The experimental results show up to 27% reduction of required registers.
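
The detect-and-replay idea described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: `FaultyAdder` is a hypothetical fault injector, `dmr_execute` is a hypothetical helper, and the tuple of inputs stands in for the error-free data that the registers must hold until the comparison succeeds. The paper's register minimization itself is not shown.

```python
class FaultyAdder:
    """Adder that flips the lowest result bit on scheduled calls (hypothetical)."""

    def __init__(self, faulty_calls):
        self.faulty_calls = set(faulty_calls)  # call indices that suffer a soft error
        self.calls = 0

    def __call__(self, x, y):
        result = x + y
        if self.calls in self.faulty_calls:
            result ^= 1  # emulate a single-bit soft error
        self.calls += 1
        return result


def dmr_execute(op, inputs, max_replays=3):
    """Run `op` twice on the saved inputs; replay while the two results differ.

    Returns (result, number_of_replays). The inputs must stay unchanged
    (error-free) across replays, which is what the stored registers provide.
    """
    for replay in range(max_replays + 1):
        first = op(*inputs)
        second = op(*inputs)
        if first == second:       # no mismatch detected: accept the result
            return first, replay
    raise RuntimeError("soft error persisted after all replays")


adder = FaultyAdder(faulty_calls={1})   # the second execution is corrupted
outcome = dmr_execute(adder, (3, 4))
```

In this run the first pair of executions disagrees, so the operation is replayed once from the saved inputs and the second pair agrees.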

R3-6 (Time: 10:30 - 10:32)
TitleComparison of Diagnostic Performance Metrics for Test Point Selection in Analog Circuits
Author*Koutaro Hachiya (Teikyo Heisei University, Japan), Atsushi Kurokawa (Hirosaki University, Japan)
Pagepp. 198 - 203
KeywordAnalog Test, Diagnostic Performance Metric, 3D-IC, Through Silicon Via
AbstractDiagnostic performance metrics proposed in literature for finding measurement points in analog circuits are compared in terms of four properties: related to test metrics, sensitivity, symmetric and parametric. According to the comparison result, the guideline for metrics selection is proposed. As a case study, the metrics are applied to finding measurement points to detect open defects of through silicon vias in power distribution networks of 3D-ICs.

R3-7 (Time: 10:32 - 10:34)
TitleA 12-bit 500-kS/s SAR ADC with Reconfigurable Mismatch Tolerance
Author*Yu-Hsiang Nien, Tsung-Heng Tsai (National Chung Cheng University, Taiwan)
Pagepp. 204 - 207
KeywordSAR ADC
AbstractThis paper presents an energy-efficient 12-bit 500-kS/s SAR ADC with reconfigurable mismatch tolerance for high-resolution wearable biomedical sensor networks. Switching-back is used to create a tolerance range of 1/4 Vref per bit. Reconfigurable mismatch tolerance (RTM) is assigned for each bit independently to compensate for process variations. In this work, the unit capacitance is 1 fF. This SAR ADC consumes 39.5 μW at 500 kS/s under a 1 V supply in a 65 nm CMOS process. It achieves a signal-to-noise and distortion ratio of 64.79 dB. The effective number of bits (ENOB) is 10.4 bits, resulting in a figure of merit of 55.6 fJ/conversion-step. The implemented prototype occupies an active area of 0.178 mm².
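
The reported numbers can be cross-checked with a short calculation. The sketch below assumes the standard ENOB formula, ENOB = (SNDR - 1.76) / 6.02, and the conventional Walden figure of merit, FoM = P / (2^ENOB · fs); the abstract does not state which definitions the authors used, so the function names and formulas here are assumptions.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (standard sine-wave formula)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w: float, fs_hz: float, enob: float) -> float:
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2.0 ** enob * fs_hz)


# Values taken from the abstract.
sndr_db = 64.79      # signal-to-noise and distortion ratio
power_w = 39.5e-6    # power consumption
fs_hz = 500e3        # sampling rate

enob = enob_from_sndr(sndr_db)
fom_fj = walden_fom(power_w, fs_hz, enob) * 1e15
print(f"ENOB = {enob:.2f} bits, FoM = {fom_fj:.1f} fJ/conversion-step")
```

Under these assumptions the result lands close to the abstract's 10.4 bits and 55.6 fJ/conversion-step; small differences can come from rounding of the intermediate ENOB.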

R3-8 (Time: 10:34 - 10:36)
TitleHigh-level synthesis code optimization with loop fusion based on LLVM/Polly
Author*Yuta Hiyama, Takayuki Todokoro, Kenshu Seto (Tokyo City University, Japan), Masato Tatsuoka (Socionext Inc., Japan Advanced Institute of Science and Technology, Japan), Yoshihito Nishida (Socionext Inc., Japan), Mineo Kaneko (Japan Advanced Institute of Science and Technology, Japan)
Pagepp. 208 - 213
KeywordLoop fusion, Polyhedral model, High-level synthesis, LLVM, Polly
AbstractLoop fusion is an effective loop optimization for high-level synthesis. Loop fusion can be performed automatically with an LLVM-based polyhedral compiler called Polly. However, Polly's loop fusion algorithm may output a loop structure unsuitable for high-level synthesis. We implemented an algorithm that uses Polly to output a loop structure suitable for high-level synthesis. The proposed method reduced the average number of execution cycles for high-level synthesis by 33.4% compared to that before loop fusion.
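
For readers unfamiliar with the transformation, the following generic sketch shows what loop fusion does to two adjacent loops over the same range. It is not Polly's algorithm or the authors' method, and the function names are made up for this example; in a high-level synthesis setting the loops would be written in C, but the structure is the same.

```python
def unfused(a):
    """Two separate loops: the intermediate array b is written, then re-read."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = a[i] * 2
    for i in range(n):
        c[i] = b[i] + 1
    return c


def fused(a):
    """One fused loop: each element is produced and consumed in the same iteration."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        t = a[i] * 2      # the intermediate value stays in a scalar
        c[i] = t + 1
    return c


sample = [1, 2, 3]
assert unfused(sample) == fused(sample)
```

Fusion is legal here because iteration i of the second loop depends only on iteration i of the first; it removes one loop's worth of iteration overhead and the intermediate array.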

R3-9 (Time: 10:36 - 10:38)
TitleUltra Low Current Measurement with On-chip High Resistance of MOSFET Array
Author*Xinghuai Zhang, Daishi Isogai, Takaaki Shirakawa, Shigetoshi Nakatake (The University of Kitakyushu, Japan)
Pagepp. 214 - 217
KeywordOn-chip High Resistance, Ultra Low Current, Sensor
AbstractWe propose an on-chip high resistance using a MOSFET array. We adopt the potentiostat method for electrochemical sensing to measure ultra low current, with biosensing and implant sensing in mind. The sensor circuit includes a high resistance array which is configured by connecting unit resistors in series and parallel. We verify the DC characteristics, the area, and the temperature characteristics of the resistor array by SPICE simulation, then demonstrate a promising result compared with the conventional poly resistance.

R3-10 (Time: 10:38 - 10:40)
TitleA Note on Optimization Algorithms for FF/Latch-Based High-Level Synthesis
Author*Keisuke Inoue (International College of Technology, Kanazawa, Japan)
Pagepp. 218 - 222
Keywordhigh-level synthesis, latch
AbstractThis paper presents a new design framework for register-transfer-level data-paths. The conventional D-flip-flop-based register (D-REG) is very practical, since the designers can concentrate only on the timing constraints between registers. However, with the development of deep sub-micron technology and the increase in the data length, the D-REG hardware cost is becoming relatively larger than the other hardware resources. Thus, latch-based design methods have been proposed as alternatives to D-REG-based design methods, since the latch-based register has smaller hardware cost than D-REG. A disadvantage of the conventional latch-based architecture is the increase in the hardware resources. As a result, the total register cost cannot be fully reduced. We propose a new design framework, a kind of level-triggered latch design, in which a D-REG is replaced by a pair of latch-based registers: a master latch-based register (M-REG) and a slave latch-based register (S-REG).

R3-11 (Time: 10:40 - 10:42)
TitleFPGA Implementation for WDF-Based Analog Emulator with Complicated Topology
AuthorHsin-Ju Hsu (National Chiao Tung University, Taiwan), Ji-Xuan Tsai, Meng-Lin Li (National Central University, Taiwan), *Chien-Nan Liu (National Chiao Tung University, Taiwan), Jing-Yang Jou (National Central University, Taiwan)
Pagepp. 223 - 226
KeywordWDF, analog emulation, FPGA, system verification
AbstractSystem verification is still a big challenge for system-on-chip (SoC) designs with AMS circuits. A wave digital filter (WDF)-based approach is a possible solution to emulate analog circuits in existing FPGAs with digital circuits. In order to solve the loop problem in WDF structures, a special J-type adaptor was proposed. However, the automatic transformation flow and the corresponding FPGA implementation flow with this new J-type adaptor are not discussed in previous papers. Therefore, this paper focuses on the hardware implementation issues for WDF-based analog emulators with the J-type adaptor. The FPGA results on several circuits with nonlinear elements have demonstrated the effectiveness and feasibility of the proposed solution for supporting various circuit types on an FPGA-based platform.

R3-12 (Time: 10:42 - 10:44)
TitleBinary Synthesis from RISC-V Executables
Author*Shoki Hamana, Nagisa Ishiura (Kwansei Gakuin University, Japan)
Pagepp. 227 - 228
Keywordhigh-level synthesis, binary synthesis, RISC-V
AbstractThis paper presents an implementation of a binary synthesizer which converts a given executable binary code of RISC-V into hardware functionally equivalent to a RISC-V core executing the code. A CPU core and an instruction memory are replaced by the synthesized hardware, which reduces execution time and hardware size for small scale programs. A given binary code is disassembled and parsed to build a control dataflow graph (CDFG), then traditional high-level synthesis techniques are applied to generate RT level Verilog HDL. For small example programs consisting of 34 to 160 instructions, the synthesized hardware on a Xilinx Artix-7 FPGA took about 74.5% fewer cycles than on a RISC-V Rocket core, with a smaller number of LUTs.

R3-13 (Time: 10:44 - 10:46)
TitleDetection of Vulnerability Guard Elimination by Compiler Optimization Based on Binary Code Comparison
Author*Yuka Azuma, Nagisa Ishiura (Kwansei Gakuin University, Japan)
Pagepp. 229 - 230
Keywordsoftware security, compiler optimization, undefined behavior, binary comparison, buffer overflow
AbstractIt is known that guards against vulnerabilities in C programs might be eliminated by compiler optimization if they are not written properly. This paper proposes a method to detect such flaws in software by binary code comparison. Given a source code, a pair of binary codes is generated, one with standard optimization and the other with problematic optimization suppressed. Since simple comparison of the binary codes ends up with an unacceptable number of false positives, call instructions in each function are collated to detect discrepancies. In a preliminary experiment on 7 programs, our method successfully detected 2 instances of guard losses with only one false positive.

R3-14 (Time: 10:46 - 10:48)
TitleA Stable Equivalent Circuit Identification Algorithm for Li ion Batteries
Author*Lei Lin, Masahiro Fukui (Ritsumeikan University, Japan)
Pagepp. 231 - 236
KeywordSOC, Estimation, Parameter, RLS, EKF
AbstractThis paper discusses the equivalent circuit parameter and state synchronous estimation method for Li-ion batteries. In the conventional method, accuracy and stability are hard to improve. In order to solve this problem, we propose a solution for equivalent circuit parameter and state synchronous estimation with feedback. In this paper, we demonstrate the effectiveness of this solution through experiments.

R3-15 (Time: 10:48 - 10:50)
TitleAn Intravesical Urine Volume Sensor Robust to Body Posture and Movement
Author*Ryousuke Sakai, Shigetoshi Nakatake (The University of Kitakyushu, Japan)
Pagepp. 237 - 238
KeywordBiomedical sensor, AC impedance method, Intravesical urine volume, IoT device
AbstractIn this work, in order to prevent urinary incontinence, we aim to estimate the urination condition from the body water amount in the vicinity of the bladder. Our sensor has good robustness to body posture and movement by applying the AC impedance method to the bladder. We implement an impedance-based prototype system and conduct experiments to estimate intravesical urine volume. As a result, we confirmed that the impedance value decreased with time after drinking water. In addition, we compare the measurement results with a commercial ultrasonic monitoring system and verify the robustness of our proposed system to body posture and movement.

R3-16 (Time: 10:50 - 10:52)
TitleTest Pattern Generation for Timing Faults in Rapid Single-Flux-Quantum Circuits
Author*Kazuyoshi Takagi (Mie University, Japan), Mikihiro Ono (Kyoto University, Japan), Nobutaka Kito (Chukyo University, Japan), Naofumi Takagi (Kyoto University, Japan)
Pagepp. 239 - 243
KeywordSuperconducting RSFQ circuits, test pattern generation, timing faults, fault detection, fault diagnosis
AbstractA new fault model and test pattern generation methods considering characteristics of superconducting Rapid Single-Flux-Quantum (RSFQ) logic circuits are presented. We define a timing fault model for RSFQ circuits by focusing on the order of pulse arrivals at each clocked logic gate. Subject to the fault model, we propose test pattern generation methods for fault detection and fault diagnosis of RSFQ circuits.

R3-17 (Time: 10:52 - 10:54)
TitleIncremental Approaches for Locating Design Errors: Averaging EPI-Groups and Generating Additional Input Patterns
Author*Shogo Ohmura, Hiroshi Nakano, Nobutaka Kuroki (Kobe University, Japan), Tetsuya Hirose (Osaka University, Japan), Masahiro Numa (Kobe University, Japan)
Pagepp. 244 - 249
Keyworderror diagnosis, ECO, PLEM, EPI
AbstractThis paper presents two kinds of incremental approaches for locating design errors: averaging EPI-groups and generating additional input patterns to reduce EPI values used for extraction of error location sets in order to shorten the processing time. The experimental results have shown that the proposed techniques are effective to reduce the number of initial error location sets by 96.8% or more, and to shorten the processing time by 86.6% or more.

[To Session Table]

Invited Talk II
Time: 13:20 - 14:10 Tuesday, October 22, 2019
Chair: Chia-Heng Tu (National Cheng Kung University, Taiwan)

I2-1 (Time: 13:20 - 14:10)
Title(Invited Talk) IoT for Enabling Precision Medicine
Author*Pai H. Chou (National Tsing Hua University, Taiwan)
Pagep. 250
AbstractIoT technologies have the potential of revolutionizing medicine by enabling precision diagnostics and treatment. Medical misdiagnoses are frequently caused by over-reliance on patients' biased recall and by measurement limited to the clinical settings. Doctors also have little control over follow-up treatment prescribed for outside the clinic. These limitations can be overcome by a combination of wearable medical and non-medical IoT devices that produce objective, unbiased data from or around the patient. This talk presents a number of case studies on the design of such IoT devices to enable precision medicine, including cardiovascular and pulmonary applications.

[To Session Table]

Regular Poster Session IV
Time: 14:10 - 15:40 Tuesday, October 22, 2019
Chairs: Chien-Nan Liu (National Chiao Tung University, Taiwan), Lih-Yih Chiou (National Cheng Kung University, Taiwan)

Outstanding Paper Awards
R4-1 (Time: 14:10 - 14:12)
TitleA Case Study on Design of Approximate Multipliers for MNIST CNN
Author*Kenta Shirane, Takahiro Yamamoto, Hiroyuki Tomiyama (Ritsumeikan University, Japan)
Pagepp. 251 - 255
KeywordApproximate Computing, Approximate Multiplier, CNN, MNIST
AbstractIn this paper, we present a case study on approximate multipliers for MNIST CNN. We apply approximate multipliers with different bit-widths to the convolution layer in MNIST CNN, evaluate the accuracy of MNIST recognition, and analyze the trade-off between the approximate multiplier’s area, critical path delay, and the accuracy. We further reduce the area and delay of the multipliers while keeping high accuracy in MNIST CNN.

R4-2 (Time: 14:12 - 14:14)
TitleA Layout Design Method of QCA without Fixing Data Flow
Author*Kazuki Morita, Wakaki Hattori, Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan)
Pagepp. 256 - 261
KeywordQuantum-dot Cellular Automata, clocking scheme, Field-Coupled Nanotechnology
AbstractQuantum-dot Cellular Automata (QCA) is a promising nanotechnology with ultra-low power consumption and high clock rates. Thus, QCA overcomes the physical limitation of conventional technologies like CMOS and it is an alternative technology to maintain Moore's law. Pre-planned zone clocking schemes have been proposed in order to facilitate the design of a QCA circuit. In a QCA circuit designed with a pre-planned zone clocking scheme, data flows are predetermined, which leads to an increase in circuit area. To solve this problem, this paper proposes a new approach to find an efficient data flow for a circuit. Experimental results show the usefulness of the proposed method.

R4-3 (Time: 14:14 - 14:16)
TitleAn Error Diagnosis Technique Using ZDD to Extract Error Location Sets
Author*Hiroshi Nakano, Shogo Ohmura, Nobutaka Kuroki (Kobe University, Japan), Tetsuya Hirose (Osaka University, Japan), Masahiro Numa (Kobe University, Japan)
Pagepp. 262 - 267
Keyworderror diagnosis, ZDD, ECO
AbstractThis paper presents an error diagnosis technique using ZDD (zero-suppressed binary decision diagram) to extract error location sets. A ZDD represents error location sets implicitly, which reduces processing time to extract them. Experimental results have shown that the proposed technique reduces the processing time by 92.4% on average, and the proposed variable ordering technique is effective in reducing ZDD node counts by 86.5% for large circuits.

R4-4 (Time: 14:16 - 14:18)
TitlePerformance Improvements for Block-Flushing
Author*Bao Yifang (Graduate School of Science and Engineering, Ritsumeikan University, Japan), Bing Li (Technical University of Munich, Germany), Tsung-Yi Ho (National Tsing Hua University, Taiwan), Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan)
Pagepp. 268 - 269
KeywordBlock-Flushing, Path Changing, PMD
AbstractDuring the execution of multiple bioassays, some areas on Programmable Microfluidic Devices (PMDs) become contaminated and must be cleaned by washing them with a buffer flow before they are reused. An efficient washing method called “Block-Flushing” has been proposed. We show that Block-Flushing can make the cleaning work complicated in specific cases, and then we propose an improvement of Block-Flushing that alleviates the situation by adjusting flushing paths carefully.

R4-5 (Time: 14:18 - 14:20)
TitleA Proposal of Application Specific Approach with RISC-V Processor on FPGA
Author*Tetsuo Miyauchi, Kiyofumi Tanaka (Japan Advanced Institute of Science and Technology, Japan)
Pagepp. 270 - 273
KeywordRISC-V, FPGA, Processor, Adapting
AbstractCurrently, the number of IoT (Internet of Things) devices is increasing. In IoT devices, a small footprint is desirable. RISC-V is an open processor architecture, which is becoming popular for IoT devices. We implemented a RISC-V soft processor core, whose instruction set is RV32IM (base integer instruction set and multiplication/division with 32-bit registers), on an FPGA with a 5-stage pipeline. In this paper, we propose a method for reducing hardware resources by adapting the processor core to an application program. We show our approach can reduce necessary FPGA resources to 14.8% (Rijndael) and 14.4% (Matrix) of the full processor core implementation.

R4-6 (Time: 14:20 - 14:22)
TitleA Study on the Optimization of Asynchronous Circuits During RTL Conversion from Synchronous Circuits
Author*Shogo Semba, Hiroshi Saito (The University of Aizu, Japan)
Pagepp. 274 - 279
Keywordasynchronous circuits, RTL design, optimization
AbstractIn this paper, we propose three optimization methods for asynchronous circuits during the Register Transfer Level (RTL) conversion from synchronous RTL models. The modularization of datapath resources and the restriction of the use of D flip-flops reduce the circuit area, while fixing the control signal of the multiplexers reduces the dynamic power consumption. In the experiment, we evaluated the effect of the three optimization methods. The combination of the three optimization methods reduced the energy consumption by 24.6% in the case of a differential equation solver and by 12.6% in the case of a tiny encryption algorithm, compared to the circuits without the proposed optimization methods.

R4-7 (Time: 14:22 - 14:24)
TitleEffect of Reducing the Bit Length of LFSRs for SC
Author*Yudai Sakamoto, Shigeru Yamashita (Graduate School of Science and Engineering, Ritsumeikan University, Japan)
Pagepp. 280 - 285
KeywordStochastic Computing, Stochastic Number, linear-feedback shift register (LFSR)
AbstractStochastic Computing (SC) is an approximation method to calculate functions by using Stochastic Numbers (SNs) which are generated by a linear-feedback shift register (LFSR) and a comparator in general. In this paper, we propose a method to reduce the bit length of LFSRs, and then we verify the errors of the proposed method. We provide some experimental results by which we can confirm that our proposed scheme is very useful.
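
The generation scheme named in this abstract (an LFSR feeding a comparator) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' bit-length reduction method: the 8-bit width, the tap set (8, 6, 5, 4), the seeds, and all function names are choices made here for the example. Multiplication by a bitwise AND of two streams is the standard SC operation and is included to show how the streams are used.

```python
def lfsr_states(nbits, taps, seed, length):
    """Return `length` successive states of a Fibonacci LFSR (left shift)."""
    mask = (1 << nbits) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be nonzero")
    states = []
    for _ in range(length):
        states.append(state)
        feedback = 0
        for t in taps:  # taps are 1-indexed bit positions
            feedback ^= (state >> (t - 1)) & 1
        state = ((state << 1) | feedback) & mask
    return states


def stochastic_number(value, states, nbits):
    """Comparator: emit 1 whenever the LFSR state is below the scaled value."""
    threshold = round(value * (1 << nbits))
    return [1 if s < threshold else 0 for s in states]


def sn_value(bits):
    """Value represented by a stochastic number: the fraction of 1s."""
    return sum(bits) / len(bits)


NBITS = 8               # assumed LFSR bit length for this example
TAPS = (8, 6, 5, 4)     # assumed tap positions giving a maximal-length sequence
PERIOD = (1 << NBITS) - 1

if __name__ == "__main__":
    a = stochastic_number(0.5, lfsr_states(NBITS, TAPS, 0x01, PERIOD), NBITS)
    b = stochastic_number(0.5, lfsr_states(NBITS, TAPS, 0x5A, PERIOD), NBITS)
    product = [x & y for x, y in zip(a, b)]  # AND gate multiplies the values
    print(sn_value(a), sn_value(b), sn_value(product))
```

Changing `NBITS` (with a matching tap set) shortens or lengthens the stream, which is the parameter whose effect on error the paper studies.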

R4-8 (Time: 14:24 - 14:26)
TitleDesign of Asynchronous Circuits on Commercial FPGAs Using Placement Constraints
Author*Tatsuki Otake, Hiroshi Saito (The University of Aizu, Japan)
Pagepp. 286 - 291
Keywordasynchronous circuits, FPGA, placement constraints
AbstractIn this paper, we propose a design method to implement asynchronous circuits with bundled-data implementation on commercial Field Programmable Gate Arrays (FPGAs) using placement constraints. Using the proposed method, we can obtain asynchronous circuits whose performance is close to, and whose energy consumption is smaller (21.3% reduction on average) than, that of the synchronous counterpart, with fewer delay adjustments.

R4-9 (Time: 14:26 - 14:28)
TitleParallelizing SAT-based Coverage-Driven Design Verification
Author*Kiyoharu Hamaguchi (Shimane University, Japan)
Pagepp. 292 - 295
Keyworddesign verification, SAT solver, coverage-driven verification, automated testbench
AbstractWe show results on parallelization of automated coverage-driven verification. In our prior work, we have shown an approach which combines random simulation with input pattern generation using a SAT solver. Experimental results show that the parallelization is promising for achieving higher coverage.

R4-10 (Time: 14:28 - 14:30)
TitleQuantitative Performance Comparison of Asynchronous and Synchronous Comparators
Author*Kyota Akimoto, Toshiki Kanamoto, Atsushi Kurokawa, Masashi Imai (Hirosaki University, Japan)
Pagepp. 296 - 297
Keywordasynchronous circuit, comparator, bundled-data, average performance, hardware merge sorter
AbstractAsynchronous circuits, which can achieve average performance thanks to request-and-acknowledge handshaking protocols, have great potential to improve speed performance compared to synchronous circuits. In this paper, we propose a performance-efficient circuit structure of a comparator for hardware merge sorters. We evaluate the proposed circuit and its counterpart synchronous circuit using 130nm process technology. As a result, the proposed asynchronous comparator can achieve higher performance than synchronous circuits, depending on the input data.

R4-11 (Time: 14:30 - 14:32)
TitleWire Load Model for Rapid Power Consumption Evaluation in Early Design Stage of Via-Switch FPGA
Author*Asuka Natsuhara, Takashi Imagawa, Hiroyuki Ochi (Ritsumeikan University, Japan)
Pagepp. 298 - 303
Keywordatom switch, reconfigurable architecture, power estimation
AbstractThis paper proposes a wire load model for via-switch FPGA to allow simulation-based power estimation before routing. Via-switch FPGA is expected to achieve a dramatic improvement in area, delay, and power compared with conventional SRAM-based FPGA. To estimate the power consumption of an application circuit mapped on a via-switch FPGA, a time-consuming routing process was needed before circuit simulation. Using the proposed post-placement simulation flow, runtime for power estimation is reduced by 63.8% on average compared with the conventional post-routing simulation flow, with 11.8% degradation of estimation error on average.

R4-12 (Time: 14:32 - 14:34)
TitleClock Tree Modification for Circuits with Programmable Delay Elements
Author*Kota Muroi, Yukihide Kohira (The University of Aizu, Japan)
Pagepp. 304 - 309
Keywordpost-silicon delay tuning, programmable delay element, clock tree modification
AbstractIn this paper, a clock tree modification method for circuits with programmable delay elements (PDEs) is proposed. Since the clock tree is designed without taking post-silicon delay tuning (PSDT) into consideration in existing methods, it may not be suitable for PSDT. Our proposed method modifies the clock tree to improve yield in PSDT. Moreover, we propose a design flow for circuits with PDEs so that the design time is shortened and the method can be applied to large circuits.

R4-13 (Time: 14:34 - 14:36)
TitleA Study on Updating Spins in Ising Model to Solve Combinatorial Optimization Problems
Author*Yuki Naito, Kunihiro Fujiyoshi (Tokyo University of Agriculture and Technology, Japan)
Pagepp. 310 - 315
KeywordIsing model, Ising computer, combinatorial optimization problem, traveling salesman problem
AbstractThe Ising model, which consists of spins and interactions among them, is a novel way to solve combinatorial optimization problems, for example, the LSI layout problem. The problem is solved by updating the spins stochastically after being mapped to the model. Spins can be updated simultaneously on hardware. However, the problems are not solved fast since two spins with interaction should not be updated simultaneously. In this paper, we give a guideline for updating the spins simultaneously to execute high-speed search and confirm it through experiments.
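
The constraint stated in this abstract, that two interacting spins should not be updated at the same time, can be illustrated by partitioning the spins into groups with no interaction inside a group. The sketch below uses greedy graph coloring and a Metropolis-style stochastic update; both are assumptions chosen for illustration and are not claimed to be the guideline the paper proposes. All names (`independent_groups`, `sweep`, and so on) are hypothetical.

```python
import math
import random


def build_neighbors(n, couplings):
    """couplings maps (i, j) -> J_ij; returns neighbors[i] = {j: J_ij}."""
    neighbors = {i: {} for i in range(n)}
    for (i, j), strength in couplings.items():
        neighbors[i][j] = strength
        neighbors[j][i] = strength
    return neighbors


def independent_groups(n, neighbors):
    """Greedy coloring: spins in the same group share no interaction."""
    color = {}
    for i in range(n):
        used = {color[j] for j in neighbors[i] if j in color}
        c = 0
        while c in used:
            c += 1
        color[i] = c
    groups = {}
    for i in range(n):
        groups.setdefault(color[i], []).append(i)
    return [groups[c] for c in sorted(groups)]


def energy(spins, couplings):
    """E = -sum over interactions of J_ij * s_i * s_j."""
    return -sum(J * spins[i] * spins[j] for (i, j), J in couplings.items())


def sweep(spins, neighbors, groups, temperature, rng):
    """Update one group at a time; spins within a group are independent,
    so on hardware they could be updated simultaneously."""
    for group in groups:
        fields = {i: sum(J * spins[j] for j, J in neighbors[i].items())
                  for i in group}
        for i in group:
            delta = 2 * spins[i] * fields[i]  # energy change if spin i flips
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                spins[i] = -spins[i]


if __name__ == "__main__":
    # Four spins on a ring with ferromagnetic couplings (example data).
    couplings = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1}
    neighbors = build_neighbors(4, couplings)
    groups = independent_groups(4, neighbors)
    rng = random.Random(0)
    spins = [rng.choice([-1, 1]) for _ in range(4)]
    for _ in range(20):
        sweep(spins, neighbors, groups, temperature=0.5, rng=rng)
    print(groups, spins, energy(spins, couplings))
```

Fewer groups mean more spins updated per step, so the quality of the partition directly bounds the achievable parallelism.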

R4-14 (Time: 14:36 - 14:38)
TitleA Fast Hotspot Detector Based on Local Features Using Concentric Circle Area Sampling
Author*Hidekazu Takahashi, Shimpei Sato, Atsushi Takahashi (Department of Information and Communications Engineering, Tokyo Institute of Technology, Japan)
Pagepp. 316 - 321
KeywordDesign for Manufacturability, Lithography Hotspot Detection, Machine Learning
AbstractWith the advance of technology nodes, defective circuit patterns have come to occur on chips. Areas that may cause defects such as opens/shorts are called hotspots. In this paper, we propose a hotspot detector based on the probability distribution of feature vectors. Experimental results show that our proposed method achieves 98% accuracy while the False Positive Rate is less than 1%, and its computation is 8 times faster than conventional machine learning based methods on the ICCAD2012 benchmark suite.

R4-15 (Time: 14:38 - 14:40)
TitleROAD: A Novel Approach for Improving Reliability of Multi-core Systems— How Asymmetric Aging Can Lead a Way
AuthorYu-Guang Chen (National Central University, Taiwan), Jian-Ting Ke (National Cheng Kung University, Taiwan), *Shu-Ting Cheng (Yuan Ze University, Taiwan), Ing-Chao Lin (National Cheng Kung University, Taiwan)
Pagepp. 322 - 323
KeywordAsymmetric Aging, NBTI, multi-core system
AbstractNegative-Bias Temperature Instability (NBTI) has become one of the most drastic reliability threats in modern IC designs. To tolerate NBTI on multi-core systems, previous researchers have proposed various task assignment and/or dynamic voltage frequency scaling algorithms. Most of the proposed methods maintain all cores in the multi-core system under similar aging conditions (symmetric aging). Although these methods can mitigate NBTI, the symmetric aging may reduce the lifetime of a multi-core system. If a critical task (i.e., a task with a tight timing constraint) arrives when the system has already operated for years, it is possible that none of the equivalently aged cores can complete the critical task within its timing constraints. This unavoidable timing failure will then shorten the lifetime of the system. Based on the above observation, this paper proposes a novel reliability improvement framework which realizes the concept of asymmetric aging by task graph Retiming, task Ordering, task Assignment under asymmetric aging, and Dynamic voltage selection (ROAD) for multi-core systems. Experimental results show that our approach can significantly increase the system lifetime with no or insignificant energy overhead.

R4-16 (Time: 14:40 - 14:42)
TitleA Tuning-Free Reservoir of MOSFET Crossbar Array for Inexpensive Hardware Realization of Echo State Network
Author*Yuki Kume, Masayuki Hiromoto, Takashi Sato (Graduate School of Informatics, Kyoto University, Japan)
Pagepp. 324 - 329
Keywordrecurrent neural network, echo state network, reservoir computing, hardware implementation, weight tuning
AbstractEcho state network (ESN) is a class of recurrent neural network, which drastically reduces training time by the use of a reservoir, a random and fixed network serving as the input and middle layers. In this paper, we propose a hardware implementation of ESN that uses an inexpensive MOSFET-based reservoir. As opposed to existing reservoirs that require post tuning of weights for stability improvement, our ESN requires no post parameter tuning. For that purpose, we extend the circular law of random matrices to sparse reservoirs so that a fixed feedback gain can be determined. Through evaluations using the Mackey-Glass time-series dataset, the proposed ESN achieved stable and successful inference without post parameter tuning.

R4-17 (Time: 14:42 - 14:44)
TitleEstimation of NBTI-Induced Timing Degradation Considering Duty Ratio
Author*Kunihiro Oshima, Song Bian, Takashi Sato (Graduate School of Informatics, Kyoto University, Japan)
Pagepp. 330 - 335
KeywordNegative bias temperature instability, timing degradation sensor, critical path, duty ratio
AbstractWe propose a novel estimation method for NBTI-induced timing degradation that takes the duty ratios of the input signals into account. In the proposed method, the signal propagation delay is evaluated with the proposed replica sensor circuit. With evaluations of the threshold voltage degradation model, delays of critical path candidates are estimated. The simulation results show that the proposed method can reduce the estimation error of critical path delay by 63% compared to the delay estimation without duty consideration.

R4-18 (Time: 14:44 - 14:46)
TitlePolygon Fracture Method Considering Maximum Shot Size for Variable Shaped-Beam Mask Writing
AuthorMitsuru Hasegawa, *Kunihiro Fujiyoshi (Tokyo University of Agriculture and Technology, Japan)
Pagepp. 336 - 340
Keywordpolygon fracturing, EB writing, mask data, dynamic programming
AbstractSince variable shaped-beam mask writing machines for LSI mask production can expose a rectangular shaped beam, we need to partition rectilinear polygons in a layout into a minimum number of rectangles while considering the size limit. In this paper, we propose a new fracturing method for convex rectilinear polygons using dynamic programming, which first cuts each polygon by slice-lines through concave vertices. The proposed method can solve the problem in polynomial time. Computer experiments confirm the space and time complexity of the method.

Invited Talk III
Time: 15:40 - 16:30 Tuesday, October 22, 2019
Chair: Tsung-Yi Ho (National Tsing Hua University, Taiwan)

I3-1 (Time: 15:40 - 16:30)
Title(Invited Talk) Design and Demonstration of Superconducting Single Flux Quantum Circuits Operating around 50 GHz
Author*Akira Fujimaki (Nagoya University, Japan)
Pagepp. 341 - 342
AbstractWe have been developing very high-speed digital processors including microprocessors based on the superconducting single flux quantum (SFQ) circuit. So far, we have successfully executed programs stored in an embedded memory at 50 GHz in a bit-serial microprocessor and demonstrated an 8-bit-parallel arithmetic logic unit at 50 GHz. These SFQ circuits show extremely high energy-efficiency and high performance compared to semiconductor circuits even if the cooling penalty for superconducting circuits is considered. The SFQ circuit is classified as pulse logic, in which binary signals ‘1’ and ‘0’ are defined as the presence and absence of a signal impulse between two consecutive clock signals, respectively. Pulse logic is free from the charging and discharging of capacitors or inductors, which leads to the features of high-speed operation and low power consumption. The impulses of the SFQ circuits, referred to as SFQ pulses, have a typical pulse width of 4 ps and a sub-mV pulse height. The pulses corresponding to signals and clocks can propagate along superconducting passive transmission lines (PTLs) of the strip line/micro strip line structures at the speed of light with very small distortion, while transmitters and receivers made up of a few Josephson junctions are needed. Special care is required in designing SFQ circuits, because all the logic gates need clock signals. The setup- and hold-times for logic gates are 4 ps at most, and the accumulated time jitter of signals traveling in very long transmission lines reaches 1 ps. This means that the effective time window of signals relative to the two consecutive clock signals becomes 10 ps for 50-GHz operation. Considering the parallel data lines, the timing of the signals, including clocks, arriving at the logic gates must be controlled on the order of picoseconds.
We have been building the top-down design method for SFQ large-scale integrated circuits based on the cell library, in which the bias-voltage-dependent timing parameters such as delay, setup time, and hold time are registered for all the logic gates and interconnects.