Models and Methods for Development of DSP Applications on Manycore Processors
Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS), Centre for Research on Embedded Systems (CERES).
2009 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Advanced digital signal processing systems require specialized high-performance embedded computer architectures. The term high-performance translates to large amounts of data and computations per time unit. The term embedded further implies requirements on physical size and power efficiency. Thus, the requirements are both functional and non-functional in nature. This thesis addresses the development of high-performance digital signal processing systems relying on manycore technology. We propose building two-level hierarchical computer architectures for this domain of applications. Further, we outline a tool flow based on methods and analysis techniques for automated, multi-objective mapping of such applications on distributed memory manycore processors. In particular, the focus is put on how to provide a means for tunable strategies for mapping of task graphs on array structured distributed memory manycores, with respect to given application constraints. We argue for code mapping strategies based on predicted execution performance, which can be used in an auto-tuning feedback loop or to guide manual tuning directed by the programmer. Automated parallelisation, optimisation and mapping to a manycore processor benefit from the use of a concurrent programming model as the starting point. Such a model allows the programmer to express different types and granularities of parallelism as well as computation characteristics of importance in the addressed class of applications. The programming model should also abstract away machine dependent hardware details. The analytical study of WCDMA baseband processing in radio base stations, presented in this thesis, suggests dataflow models as a good match to the characteristics of the application and as an execution model abstracting computations on a manycore. Construction of portable tools further requires a manycore machine model and an intermediate representation.
The models are needed in order to decouple algorithms, used to transform and map application software, from hardware. We propose a manycore machine model that captures common hardware resources, as well as resource dependent performance metrics for parallel computation and communication. Further, we have developed a multifunctional intermediate representation, which can be used as source for code generation and for dynamic execution analysis. Finally, we demonstrate how we can dynamically analyse execution using abstract interpretation on the intermediate representation. It is shown that the performance predictions can be used to accurately rank different mappings by best throughput or shortest end-to-end computation latency.

Place, publisher, year, edition, pages
Göteborg: Chalmers University of Technology, 2009. 173 p.
Series
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie, ISSN 0346-718X; 2969
Keyword [en]
parallel processing, manycore processors, high-performance digital signal processing, dataflow, concurrent models of computation, parallel code mapping, parallel machine model, dynamic performance analysis
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:hh:diva-14706; ISBN: 978-91-7385-288-3; OAI: oai:DiVA.org:hh-14706; DiVA: diva2:408234
Public defence
2009-06-10, Wigforssalen, house Visionen, Halmstad University, Kristian IV:s väg 3, Halmstad, 13:15 (English)
Available from: 2011-04-20; Created: 2011-04-04; Last updated: 2011-04-20; Bibliographically approved
List of papers
1. (Record not available)
2. Baseband Processing in 3G UMTS Radio Base Stations
2006 (English). Report (Other academic)
Abstract [en]

This report presents a study of functionality, service dataflows, computation characteristics and processing parameters for baseband processing in radio base stations (RBS). The study has been performed with the objective of developing a programming model that is natural and efficient to use for baseband programming and which can be efficiently compiled to parallel computing structures. To achieve this objective, it is necessary to analyse and understand the logical architecture of the application so that processing characteristics, and thereby requirements on languages as well as on physical system architectures, can be defined. Moreover, to be able to test and verify programming and mapping of functions it is necessary to have realistic but still manageable test cases. The study is focused on the Third Generation Partnership Project (3GPP) standard specifications for 3G radio base stations. The specifications cover the complete 3G network architecture and are quite extensive and complex. To make experiments manageable, it is necessary to abstract system functionality that is not directly relevant for the RBS baseband processing. Moreover, the standard specifications only describe the required processing functionality on an abstract logical level. In this report, the functionality of the baseband functions is explained and also described using illustrations of dataflows and abstract mapping of two 3G service cases. The results of the study constitute a comprehensive description of the processing flow and the mapping of user data channels in 3G radio base stations, spanning data and control input from layer 2 to physical channel output from layer 1. Data dependencies between functions are illustrated with figures and it is concluded that these dependencies are of producer/consumer type.
It is discussed how different functions can be mapped in MIMD and SIMD fashion with regard to the data dependencies, the data stream lengths and the control operations required to handle bit stream processing on word-length processor architectures.

Place, publisher, year, edition, pages
Halmstad: Halmstad University, 2006
Series
Technical Report, IDE 0629
Keyword
Baseband processing, Radio base stations
National Category
Engineering and Technology
Identifiers
urn:nbn:se:hh:diva-2721 (URN); 2082/3123 (Local ID); 2082/3123 (Archive number); 2082/3123 (OAI)
Available from: 2009-07-06; Created: 2009-07-06; Last updated: 2012-09-22; Bibliographically approved
3. A configurable framework for stream programming exploration in baseband applications
2006 (English). In: 2006 IEEE International Parallel & Distributed Processing Symposium: Rhodes Island, Greece: 25-29 April, 2006, Piscataway, N.J.: IEEE Press, 2006, 8- p. Conference paper (Refereed)
Abstract [en]

This paper presents a configurable framework to be used for rapid prototyping of stream-based languages. The framework is based on a set of design patterns defining the elementary structure of a domain-specific language for high-performance signal processing. A stream language prototype for baseband processing has been implemented using the framework. We introduce language constructs to efficiently handle dynamic reconfiguration of distributed processing parameters. It is also demonstrated how new language-specific primitive data types and operators can be used to efficiently and machine independently express computations on bitfields and data-parallel vectors. These types and operators yield code that is readable, compact and amenable to a stricter type checking than is common practice. They make it possible for a programmer to explicitly express parallelism to be exploited by a compiler. In short, they provide a programming style that is less error-prone and has the potential to lead to more efficient implementations.

Place, publisher, year, edition, pages
Piscataway, N.J.: IEEE Press, 2006
Keyword
distributed processing, program compilers, software prototyping, telecommunication, computing, telecommunication signalling
National Category
Engineering and Technology
Identifiers
urn:nbn:se:hh:diva-2104 (URN); 10.1109/IPDPS.2006.1639502 (DOI); 2-s2.0-33847132885 (ScopusID); 2082/2499 (Local ID); 1-4244-0054-6 (ISBN); 2082/2499 (Archive number); 2082/2499 (OAI)
Conference
20th International Parallel and Distributed Processing Symposium, IPDPS 2006, Rhodes Island, Greece : 25-29 April, 2006
Note

©2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Available from: 2008-11-04; Created: 2008-11-04; Last updated: 2014-08-21; Bibliographically approved
4. A Domain-specific Approach for Software Development on Manycore Platforms
2008 (English). In: SIGARCH Computer Architecture News, ISSN 0163-5964, E-ISSN 0163-5694, Vol. 36, no 5, 2-10 p. Article in journal (Refereed). Published
Abstract [en]

The programming complexity of increasingly parallel processors calls for new tools that assist programmers in utilising the parallel hardware resources. In this paper we present a set of models that we have developed as part of a tool for mapping dataflow graphs onto manycores. One of the models captures the essentials of manycores identified as suitable for signal processing, and which we use as target for our algorithms. As an intermediate representation we introduce timed configuration graphs, which describe the mapping of a model of an application onto a machine model. Moreover, we show how a timed configuration graph by very simple means can be evaluated using an abstract interpretation to obtain performance feedback. This information can be used by our tool and by the programmer in order to discover improved mappings.

Place, publisher, year, edition, pages
New York: ACM Press, 2008
Keyword
Programming, Manycores
National Category
Computer Engineering
Identifiers
urn:nbn:se:hh:diva-5990 (URN); 10.1145/1556444.1556446 (DOI)
Conference
Association for Computing Machinery Special Interest Group on Computer Architecture
Available from: 2010-09-23; Created: 2010-09-23; Last updated: 2014-08-21; Bibliographically approved
5. Manycore performance analysis using timed configuration graphs
2009 (English). In: International Symposium on Systems, Architectures, Modeling, and Simulation, 2009. SAMOS '09 / [ed] Michael Joseph Schulte and Walid Najjar, Piscataway, N.J.: IEEE Press, 2009, 108-117 p. Conference paper (Refereed)
Abstract [en]

The programming complexity of increasingly parallel processors calls for new tools to assist programmers in utilising the parallel hardware resources. In this paper we present a set of models that we have developed to form part of a tool which is intended for iteratively tuning the mapping of dataflow graphs onto manycores. One of the models is used for capturing the essentials of manycores that are identified as suitable for signal processing and which we use as target architectures. Another model is the intermediate representation in the form of a timed configuration graph, describing the mapping of a dataflow graph onto a machine model. Moreover, this IR can be used for performance evaluation using abstract interpretation. We demonstrate how the models can be configured and applied in order to map applications on the Raw processor. Furthermore, we report promising results on the accuracy of performance predictions produced by our tool. It is also demonstrated that the tool can be used to rank different mappings with respect to optimisation on throughput and end-to-end latency.

Place, publisher, year, edition, pages
Piscataway, N.J.: IEEE Press, 2009
Keyword
graphs, microcomputers, parallel architectures, parallel programming, program compilers, software performance evaluation, task analysis
National Category
Computer Engineering
Identifiers
urn:nbn:se:hh:diva-5987 (URN); 10.1109/ICSAMOS.2009.5289221 (DOI); 000276377000014 (); 2-s2.0-71949094275 (ScopusID); 978-1-4244-4502-8 (ISBN)
Conference
2009 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, IC-SAMOS 2009, Samos, 20 - 23 July, 2009
Note

©2009 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Available from: 2010-09-23; Created: 2010-09-23; Last updated: 2014-08-21; Bibliographically approved

By author/editor
Bengtsson, Jerker
