
The 21st Asia and South Pacific Design Automation Conference

Session 6B  Energy-Efficient & Customized Computing
Time: 10:20 - 12:00 Thursday, January 28, 2016
Location: TF4304
Chairs: Weichen Liu (Chongqing University, China), Yaoyao Ye (Dept. of Micro/Nano-Electronics, Shanghai Jiao Tong University, China)

6B-1 (Time: 10:20 - 10:45)
Title: Footfall – GPS Polling Scheduler for Power Saving on Wearable Devices
Author: *Kent W. Nixon, Xiang Chen, Yiran Chen (University of Pittsburgh, U.S.A.)
Page: pp. 563 - 568
Keywords: GPS, scheduler, map-matching, wearable
Abstract: Wrist-worn wearable fitness devices, such as FitBit and Apple Watch, have become popular in recent years. Runners can use the GPS embedded in these devices to log the route taken during exercise, providing vital feedback on pace and distance traveled. Unfortunately, continuous polling for GPS data has a significant adverse impact on device battery life: many flagship wearables must be recharged as often as every two days, or even more frequently. In this work, we propose Footfall, an intelligent GPS scheduler that uses data from alternative sensors on a device to greatly reduce GPS utilization while still maintaining a minimum location accuracy. Compared to existing implementations, the Footfall system achieves on average a 75% reduction in total power consumption while inducing only a 5% discrepancy in location accuracy, which is sufficient for the targeted applications.
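The scheduling idea in this abstract, stretching the GPS polling interval while cheap inertial sensors keep the position estimate within an error budget, can be sketched as follows. All function names, the stride length, and the drift thresholds are hypothetical illustrations, not taken from the paper.

```python
def estimate_distance_from_steps(step_count, stride_m=0.9):
    """Dead-reckon distance traveled from pedometer steps.
    stride_m is a hypothetical per-user calibration constant."""
    return step_count * stride_m

def next_gps_poll_interval(drift_m, max_drift_m=25.0, base_s=1.0, max_s=60.0):
    """Lengthen the GPS polling interval while the estimated dead-reckoning
    drift stays under the error budget; poll immediately once it is exceeded."""
    if drift_m >= max_drift_m:
        return base_s  # drift too large: poll GPS now to re-anchor the position
    # Scale the interval with the remaining headroom in the error budget.
    headroom = 1.0 - drift_m / max_drift_m
    return min(max_s, base_s + headroom * max_s)
```

With this kind of policy the radio stays off while the runner's pace is steady and predictable, which is where the bulk of the power saving would come from.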

6B-2 (Time: 10:45 - 11:10)
Title: CP-FPGA: Computation Data-Aware Software/Hardware Co-design for Nonvolatile FPGAs based on Checkpointing Techniques
Author: *Zhe Yuan, Yongpan Liu, Hehe Li, Huazhong Yang (Tsinghua University, China)
Page: pp. 569 - 574
Keywords: FPGA, nonvolatile, checkpoint, IoT
Abstract: With the booming Internet of Things (IoT), reconfigurable devices such as FPGAs have attracted a great deal of attention for their flexibility and high performance. However, commercial FPGAs suffer from high leakage power consumption, which makes the zero-leakage nonvolatile FPGA (nvFPGA) promising. This paper proposes a hardware/software co-designed nvFPGA with an efficient checkpointing strategy. Using nonvolatile checkpointing BRAM (CBRAM), it preserves both computation data and configuration across power-off, avoiding expensive rollbacks due to data loss. A checkpointing location-aware technique balances computation rollback overheads against backup energy. Experimental results show that the proposed checkpointing strategy reduces the backup data of the nvFPGA by 45.8% when system-level power gating occurs.
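The location-aware trade-off described above (re-execution cost after a rollback versus the energy spent backing up live data) can be illustrated with a minimal sketch. The cost model, stage granularity, and failure probability below are assumptions for illustration, not the paper's actual formulation.

```python
def checkpoint_cost(stage_energies, backup_energies, k, fail_prob):
    """Expected overhead of checkpointing after stage k: pay the backup energy
    for the live data at that boundary, plus (with probability fail_prob)
    re-execute the stages after the checkpoint when power fails later."""
    redo = sum(stage_energies[k + 1:])
    return backup_energies[k] + fail_prob * redo

def best_checkpoint_location(stage_energies, backup_energies, fail_prob):
    """Exhaustively evaluate every stage boundary and return the cheapest one."""
    n = len(stage_energies)
    return min(range(n),
               key=lambda k: checkpoint_cost(stage_energies, backup_energies,
                                             k, fail_prob))
```

Checkpointing late shrinks the rollback window but may have more live computation data to save; checkpointing early saves little data but risks a long re-execution, which is exactly the tension a location-aware technique has to resolve.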

6B-3 (Time: 11:10 - 11:35)
Title: Design Space Exploration of FPGA-Based Deep Convolutional Neural Networks
Author: Mohammad Motamedi, *Philipp Gysel, Venkatesh Akella, Soheil Ghiasi (University of California, Davis, U.S.A.)
Page: pp. 575 - 580
Keywords: DCNN, Accelerator, Design Space Exploration, Deep Convolutional Neural Network
Abstract: Deep Convolutional Neural Networks (DCNNs) have proven very effective in many pattern recognition applications. To meet performance and energy-efficiency constraints, various hardware accelerators have been developed. In this paper, we propose an FPGA-based accelerator that can handle convolutional layers with large hyperparameters. We present a design space exploration algorithm that finds the optimal architecture by leveraging all sources of parallelism. To the best of our knowledge, we improve the state of the art for AlexNet on a large FPGA by 1.9X.
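A design space exploration of the kind the abstract describes can be sketched as a brute-force search over parallelism (unroll) factors under a resource budget. The throughput proxy, the single DSP-budget constraint, and the divisibility requirement are simplified assumptions for illustration, not the authors' model.

```python
from itertools import product

def explore(num_filters, num_channels, dsp_budget):
    """Enumerate unroll factors (tf filters x tc channels in parallel) for one
    convolutional layer, keeping only points that fit the DSP budget and divide
    the layer dimensions evenly; score each point by its parallel MACs per cycle."""
    best = None
    for tf, tc in product(range(1, num_filters + 1),
                          range(1, num_channels + 1)):
        if tf * tc > dsp_budget:        # does not fit on the device
            continue
        if num_filters % tf or num_channels % tc:
            continue                    # avoid partial (padded) tiles
        score = tf * tc                 # proxy for throughput
        if best is None or score > best[0]:
            best = (score, tf, tc)
    return best
```

A real exploration would also model BRAM capacity and off-chip bandwidth per candidate, but even this toy version shows why exhaustive search is feasible: the space of legal unroll factors for one layer is small.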

6B-4 (Time: 11:35 - 12:00)
Title: LRADNN: High-Throughput and Energy-Efficient Deep Neural Network Accelerator using Low Rank Approximation
Author: *Jingyang Zhu (Hong Kong University of Science and Technology, Hong Kong), Zhiliang Qian (Shanghai Jiao Tong University, China), Chi-Ying Tsui (Hong Kong University of Science and Technology, Hong Kong)
Page: pp. 581 - 586
Keywords: Deep Neural Network, Low Rank Approximation, VLSI architecture, Energy Efficiency
Abstract: In this work, we propose LRADNN, an energy-efficient hardware accelerator for Deep Neural Networks (DNNs) that uses low rank approximation. Under this scheme, inactive neurons in each layer of the DNN are dynamically identified and the corresponding computations are bypassed, saving both the memory accesses and the arithmetic operations associated with those neurons. Compared to architectures using the direct feed-forward algorithm, LRADNN therefore achieves higher throughput and lower energy consumption with negligible prediction accuracy loss (within 0.1%). We implement and synthesize the proposed accelerator in TSMC 65nm technology. Experimental results show an energy reduction of 31% to 53% together with a throughput increase of 22% to 43%.
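The bypass mechanism the abstract describes, predicting which neurons will be inactive via a cheap low-rank product U(Vx) and skipping their exact computation, can be sketched in a few lines. The rank-1 factorization in the test is hand-picked for illustration; how U and V are actually obtained is the paper's contribution, not shown here.

```python
def matvec(W, x):
    """Dense matrix-vector product over plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def lradnn_layer(W, U, V, x):
    """One ReLU layer with low-rank activity prediction: compute the cheap
    rank-r estimate U(Vx) first, then evaluate the full dot product only for
    rows predicted active; predicted-inactive rows are bypassed as exact zeros,
    skipping both their weight fetches and their multiply-accumulates."""
    approx = matvec(U, matvec(V, x))    # rank-r prediction, O(r * n) work
    out = []
    for i, a in enumerate(approx):
        if a > 0.0:                     # predicted active under ReLU
            z = sum(w * v for w, v in zip(W[i], x))
            out.append(max(z, 0.0))     # exact output for this neuron
        else:
            out.append(0.0)             # bypassed neuron
    return out
```

When the predictor is accurate, the only error introduced is on neurons wrongly predicted inactive, which is consistent with the small accuracy loss the abstract reports.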