Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Efficient Single-Precision Floating-Point Division Using Harmonized Parabolic Synthesis
Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Centre for Research on Embedded Systems (CERES).ORCID iD: 0000-0001-8652-0098
Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Centre for Research on Embedded Systems (CERES).ORCID iD: 0000-0003-4828-7488
Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Centre for Research on Embedded Systems (CERES).ORCID iD: 0000-0002-0562-2082
Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Centre for Research on Embedded Systems (CERES).ORCID iD: 0000-0002-4932-4036
2017 (English)In: 2017 IEEE Computer Society Annual Symposium on VLSI: ISVLSI 2017 / [ed] Michael Hübner, Ricardo Reis, Mircea Stan & Nikolaos Voros, Los Alamitos: IEEE, 2017Conference paper, Published paper (Refereed)
Abstract [en]

This paper proposes a novel method for performing division on floating-point numbers represented in IEEE-754 single-precision (binary32) format. The method is based on an inverter, implemented as a combination of Parabolic Synthesis and second-degree interpolation, followed by a multiplier. It is implemented with and without pipeline stages individually and synthesized while targeting a Xilinx Ultrascale FPGA.

The implementations show better resource usage and latency results when compared to other implementations based on different methods. In case of throughput, the proposed method outperforms most of the other works, however, some Altera FPGAs achieve higher clock rate due to the differences in the DSP slice multiplier design.

Due to the small size, low latency and high throughput, the presented floating-point division unit is suitable for high performance embedded systems and can be integrated into accelerators or be used as a stand-alone accelerator.

Place, publisher, year, edition, pages
Los Alamitos: IEEE, 2017.
Series
IEEE Computer Society Annual Symposium on VLSI, ISSN 2159-3477
Keyword [en]
Floating-point, single precision, division, FPGA, Harmonized Parabolic Synthesis
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:hh:diva-33793DOI: 10.1109/ISVLSI.2017.28ISBN: 978-1-5090-6762-6 (electronic)ISBN: 978-1-5090-6763-3 (print)OAI: oai:DiVA.org:hh-33793DiVA: diva2:1093338
Conference
IEEE Computer Society Annual Symposium on VLSI, July 3-5, 2017, Bochum, Germany
Projects
NGES
Funder
VINNOVA
Available from: 2017-05-05 Created: 2017-05-05 Last updated: 2017-08-08Bibliographically approved
In thesis
1. Utilizing Heterogeneity in Manycore Architectures for Streaming Applications
Open this publication in new window or tab >>Utilizing Heterogeneity in Manycore Architectures for Streaming Applications
2017 (English)Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

In the last decade, we have seen a transition from single-core to manycore in computer architectures due to performance requirements and limitations in power consumption and heat dissipation. The first manycores had homogeneous architectures consisting of a few identical cores. However, the applications, which are executed on these architectures, usually consist of several tasks requiring different hardware resources to be executed efficiently. Therefore, we believe that utilizing heterogeneity in manycores will increase the efficiency of the architectures in terms of performance and power consumption. However, development of heterogeneous architectures is more challenging and the transition from homogeneous to heterogeneous architectures will increase the difficulty of efficient software development due to the increased complexity of the architecture. In order to increase the efficiency of hardware and software development, new hardware design methods and software development tools are required. Additionally, there is a lack of knowledge on the performance of applications when executed on manycore architectures.

The transition began with a shift from single-core architectures to homogeneous multicore architectures consisting of a few identical cores. It now continues with a shift from homogeneous architectures with identical cores to heterogeneous architectures with different types of cores specialized for different purposes. However, this transition has increased the complexity of architectures and hence the complexity of software development and execution. In order to decrease the complexity of software development, new software tools are required. Additionally, there is a lack of knowledge on what kind of heterogeneous manycore design is most efficient for different applications and what are the performances of these applications when executed on current commercial manycores.

This thesis studies manycore architectures in order to reveal possible uses of heterogeneity in manycores and facilitate choice of architecture for software and hardware developers. It defines a taxonomy for manycore architectures that is based on the levels of heterogeneity they contain and discusses benefits and drawbacks of these levels. Additionally, it evaluates several applications, a dataflow language (CAL), a source-to-source compilation framework (Cal2Many), and a commercial manycore architecture (Epiphany). The compilation framework takes implementations written in the dataflow language as input and generates code targetting different manycore platforms. Based on these evaluations, the thesis identifies the bottlenecks of the architecture. It finally presents a methodology for developing heterogeneoeus manycore architectures which target specific application domains.

Our studies show that using different types of cores in manycore architectures has the potential to increase the performance of streaming applications. If we add specialized hardware blocks to a core, the performance easily increases by 15x for the target application while the core size increases by 40-50% which can be optimized further. Other results prove that dataflow languages, together with software development tools, decrease software development efforts significantly (25-50%) while having a small impact (2-17%) on the performance.

Place, publisher, year, edition, pages
Halmstad: Halmstad University Press, 2017. 78 p.
Series
Halmstad University Dissertations, 29
Keyword
Manycores, parallel architectures, parallelism, streaming applications, dataflow, manycore design, heterogeneous manycores
National Category
Computer Systems
Identifiers
urn:nbn:se:hh:diva-33792 (URN)978-91-87045-60-8 (ISBN)978-91-87045-61-5 (ISBN)
Presentation
2017-06-02, Wigforss, Kristian IV:s väg 3, Halmstad, 13:15 (English)
Opponent
Supervisors
Projects
HiPEC (High Performance Embedded Computing)NGES (Towards Next Generation Embedded Systems: Utilizing Parallelism and Reconfigurability)
Funder
VINNOVASwedish Foundation for Strategic Research
Available from: 2017-05-09 Created: 2017-05-05 Last updated: 2017-05-09Bibliographically approved

Open Access in DiVA

Savas2017Efficient(866 kB)4 downloads
File information
File name FULLTEXT01.pdfFile size 866 kBChecksum SHA-512
605b19760c2917d54e88176de177862ba5f3f3704e310f23fa4e6b97e3ca7e641dbd1174a2ff966c51f0b850e349a337269efe6547b3c994c7524caed2dd4b68
Type fulltextMimetype application/pdf

Other links

Publisher's full text

Search in DiVA

By author/editor
Savas, SüleymanHertz, ErikNordström, TomasUl-Abdin, Zain
By organisation
Centre for Research on Embedded Systems (CERES)
Computer Systems

Search outside of DiVA

GoogleGoogle Scholar
Total: 4 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Altmetric score

Total: 62 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf