  • 101. Daneshtalab, M.
    et al.
    Hemani, Ahmed
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Palesi, M.
    Message from the chairs. 2013. In: MES '13: Proceedings of the first International Workshop on Many-core Embedded Systems, 2013. Conference paper (Refereed)
  • 102.
    Daneshtalab, Masoud
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems. University of Turku, Finland.
    Palesi, Maurizio
    Mak, Terrence
    Introduction to the Special Issue on Network-on-Chip Architectures. 2014. In: Computers & electrical engineering, ISSN 0045-7906, E-ISSN 1879-0755, Vol. 40, no 8, p. 257-259. Article in journal (Other academic)
  • 103. Daneshtalab, Masoud
    et al.
    Palesi, Maurizio
    Plosila, Juha
    Hemani, Ahmed
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Special issue on many-core embedded systems. 2014. In: Microprocessors and microsystems, ISSN 0141-9331, E-ISSN 1872-9436, Vol. 38, no 6, p. 525-525. Article in journal (Other academic)
  • 104. Deb, Abhijit K.
    et al.
    Öberg, Johnny
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Simulation and Analysis of Embedded DSP Systems using Petri Nets. 2003. In: Proceedings of the 14th IEEE International Workshop on Rapid System Prototyping, 2003. Conference paper (Refereed)
  • 105. Deb, Abhijit K.
    et al.
    Öberg, Johnny
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Control and Communication Performance Analysis of Embedded DSP Systems in the MASIC Methodology. 2001. In: Proceedings of the International Symposium on System Synthesis (ISSS), 2001. Conference paper (Refereed)
  • 106. Deb, Abhijit K.
    et al.
    Öberg, Johnny
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Simulation and Analysis of Embedded DSP Systems using MASIC Methodology. 2003. In: Proceedings of the Design Automation and Test Europe (DATE), 2003. Conference paper (Refereed)
  • 107.
    Deb, Abhijit Kumar
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Hemani, Ahmed
    Öberg, Johnny
    A Heterogeneous Modeling Environment of DSP Systems Using Grammar Based Approach. 2000. In: 3rd Forum on Design Languages (FDL-2000), 2000, p. 365-370. Conference paper (Refereed)
  • 108.
    Deb, Abhijit Kumar
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Hemani, Ahmed
    Öberg, Johnny
    Grammar based Design and Verification of an LPC Speech Codec: A Case Study. 2000. In: Int. Conf. on Signal Processing Applications & Technology (ICSPAT-2000), 2000. Conference paper (Refereed)
  • 109. Deivasigamani, M.
    et al.
    Tabatabaei, Shaghayeghsadat
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Mustafa, Naveed Ul
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Ijaz, Hamza
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Aslam, Haris Bin
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Liu, Shaoteng
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Concept and design of exhaustive-parallel search algorithm for Network-on-Chip. 2011. In: Int. Syst. Chip Conf., 2011, p. 150-155. Conference paper (Refereed)
    Abstract [en]

    This paper presents the concept and design of an exhaustive-parallel search algorithm for Network-on-Chip. The proposed parallel algorithm searches for a minimal path between source and destination in a forward-wave-propagation manner. The algorithm guarantees setup latency if the setup path exists. A high-performance switch is designed to support the exhaustive-parallel search algorithm. The NoC fabric is designed for an 8×8 mesh architecture and its performance is evaluated.

  • 110.
    Dhaou, I. Ben
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Dubrova, Elena
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Tenhunen, Hannu
    KTH, Superseded Departments, Electronic Systems Design.
    Power Efficient Inter-Mode Communication for Digit-Serial DSP Architectures in Deep-Submicron Technology. 2001. In:  , 2001, p. 61-66. Conference paper (Refereed)
  • 111. Ditmar, Johan
    et al.
    Torkelsson, Kjell
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    A Dynamically Reconfigurable FPGA-based Content Addressable Memory for Internet Protocol Characterization. 2000. In: Proceedings of the 10th International Conference on Field Programmable Logic and Applications, 2000, Vol. 1896, p. 19-28. Conference paper (Refereed)
  • 112. Du, G.
    et al.
    Li, M.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Gao, M.
    Wang, C.
    An analytical model for worst-case reorder buffer size of multi-path minimal routing NoCs. 2014. In: Proceedings - 2014 8th IEEE/ACM International Symposium on Networks-on-Chip, NoCS 2014, IEEE, 2014, p. 49-56. Conference paper (Refereed)
    Abstract [en]

    Reorder buffers are often needed in multi-path routing networks-on-chip (NoCs) to guarantee in-order packet delivery. However, the buffer sizes are usually over-dimensioned due to a lack of worst-case analysis, leading to unnecessarily large area overhead. Based on network calculus, we propose an analysis framework for the worst-case reorder buffer size in multi-path minimal routing NoCs. Experiments with synthetic traffic and an industry case show that our method can effectively explore the traffic splitting space, as well as the mapping effects in terms of reorder buffer size, with a maximum improvement of 36.50%.

  • 113. Du, G.
    et al.
    Zhang, C.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Saggio, A.
    Gao, M.
    Worst-case performance analysis of 2-D mesh NoCs using multi-path minimal routing. 2012. In: CODES+ISSS'12 - Proceedings of the 10th ACM International Conference on Hardware/Software-Codesign and System Synthesis, Co-located with ESWEEK, ACM, 2012, p. 123-132. Conference paper (Refereed)
    Abstract [en]

    In Network-on-Chip (NoC), multi-path routing is often preferable to single-path routing since it can better balance workload and thus provide better performance. However, performance analysis with multi-path routing is much more difficult due to complicated contention scenarios. Based on network calculus, we study worst-case performance of deterministic multi-path minimal routing on 2-D mesh NoCs. We first present a per-flow delay bound analysis technique for multi-path routing, which extends the analysis for single-path routing but deals with traffic splitting. Then we define a contention matrix to capture network congestion status. Based on the contention matrix, we propose an effective non-uniform traffic splitting strategy to improve worst-case performance. Experiments with synthetic traffic flows and an industrial case show that our analysis can effectively explore the traffic splitting space, and verify the effectiveness of the non-uniform splitting policy.
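
    To give a feel for the kind of bound that network calculus produces, here is a minimal sketch, not the analysis from the paper. It assumes a token-bucket arrival curve (burst b, rate r) and a rate-latency service curve (rate R, latency T), for which the standard delay bound is T + b/R when r <= R. The function names, the two example paths and the proportional splitting of burst and rate are all illustrative assumptions.

```python
from fractions import Fraction


def delay_bound(burst, rate, service_rate, latency):
    """Delay bound T + b/R for a token-bucket flow at a rate-latency server."""
    if rate > service_rate:
        raise ValueError("arrival rate exceeds service rate: no finite bound")
    return Fraction(latency) + Fraction(burst) / Fraction(service_rate)


def split_delay_bound(burst, rate, paths, weights):
    """Worst bound over all paths when a flow is split by the given weights.

    paths   -- list of (service_rate, latency) pairs, one per path (hypothetical)
    weights -- fractions of the flow sent on each path; must sum to 1
    Assumption: a sub-flow with weight w has burst w*b and rate w*r.
    """
    if sum(weights) != 1:
        raise ValueError("weights must sum to 1")
    return max(
        delay_bound(w * burst, w * rate, service_rate, latency)
        for w, (service_rate, latency) in zip(weights, paths)
    )


# Two hypothetical paths: a fast one and a slow one.
paths = [(4, 2), (1, 3)]
uniform = split_delay_bound(8, 1, paths, [Fraction(1, 2), Fraction(1, 2)])
skewed = split_delay_bound(8, 1, paths, [Fraction(3, 4), Fraction(1, 4)])
print(uniform, skewed)
```

    In this toy setting, sending more traffic over the faster path lowers the worst-case bound, which is the intuition behind a non-uniform split.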

  • 114.
    Dubrova, Elena
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    A Design Technique for High-Performance Self-Checking Combinational Circuits. 2001. In: Proceedings of IEEE European Test Workshop, 2001, p. 11-15. Conference paper (Refereed)
  • 115.
    Dubrova, Elena
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    A Method for Generating Full Cycles by a Composition of NLFSRs. 2014. In: Designs, Codes and Cryptography, ISSN 0925-1022, E-ISSN 1573-7586, Vol. 73, no 2, p. 469-486. Article in journal (Refereed)
    Abstract [en]

    Non-linear feedback shift registers (NLFSRs) are a generalization of linear feedback shift registers in which a current state is a non-linear function of the previous state. The interest in NLFSRs is motivated by their ability to generate pseudo-random sequences which are typically hard to break with existing cryptanalytic methods. However, it is still not known how to construct large -stage NLFSRs which generate full cycles of possible states. This paper presents a method for generating full cycles by a composition of NLFSRs. First, we show that an -stage register with period can be constructed from NLFSRs with -stages by adding to their feedback functions a logic block of size , for . This logic block implements Boolean functions representing pairs of states whose successors have to be exchanged in order to join cycles. Then, we show how to join all cycles into one by using one more logic block of size O(nk).

  • 116.
    Dubrova, Elena
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    A Scalable Method for Constructing Galois NLFSRs With Period $2^n - 1$ Using Cross-Join Pairs. 2013. In: IEEE Transactions on Information Theory, ISSN 0018-9448, E-ISSN 1557-9654, Vol. 59, no 1, p. 703-709. Article in journal (Refereed)
    Abstract [en]

    A method for constructing n-stage Galois NLFSRs with period $2^n - 1$ from n-stage maximum length LFSRs is presented. Nonlinearity is introduced into state cycles by adding a non-linear Boolean function to the feedback polynomial of the LFSR. Each assignment of variables for which this function evaluates to 1 acts as a crossing point for the LFSR state cycle. The effect of non-linearity is cancelled and state cycles are joined back by adding a copy of the same function to a later stage of the register. The presented method requires no extra time steps and it has a smaller area overhead compared to the previous approaches based on cross-join pairs. It is feasible for large n.

  • 117.
    Dubrova, Elena
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    An equivalence-preserving transformation of shift registers. 2014. In: Sequences and Their Applications - SETA 2014: 8th International Conference, Melbourne, VIC, Australia, November 24-28, 2014, Proceedings, Springer, 2014, p. 187-199. Conference paper (Refereed)
    Abstract [en]

    The Fibonacci-to-Galois transformation is useful for reducing the propagation delay of feedback shift register-based stream ciphers and hash functions. In this paper, we extend it to handle the Galois-to-Galois case as well as feedforward connections. This makes it possible to transform the Trivium stream cipher and increase its keystream data rate by 27% without any penalty in area. The presented transformation might open new possibilities for cryptanalysis of Trivium, since it induces a class of stream ciphers which generate the same set of keystreams as Trivium, but have a different structure.

  • 118.
    Dubrova, Elena
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Composition Trees in Finding Best Variable Orderings for ROBDDs. 2002. In: Proceedings of Design, Automation and Test in Europe Conference and Exhibition, 2002. Conference paper (Refereed)
  • 119.
    Dubrova, Elena
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Finding Matching Initial States for Equivalent NLFSRs in the Fibonacci and the Galois Configurations. 2010. In: IEEE Transactions on Information Theory, ISSN 0018-9448, E-ISSN 1557-9654, Vol. 56, no 6, p. 2961-2966. Article in journal (Refereed)
    Abstract [en]

    The Fibonacci and the Galois configurations of nonlinear feedback shift registers (NLFSRs) are considered. In the former, the feedback is applied to the input bit of the shift register only. In the latter, the feedback can potentially be applied to every bit. The sufficient conditions for equivalence of NLFSRs in the Fibonacci and the Galois configurations have been formulated previously. The equivalent NLFSRs in different configurations normally have to be initialized to different states to generate the same output sequences. The mapping between the initial states of two equivalent NLFSRs in the Fibonacci and the Galois configurations is derived in this paper.
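
    The difference between the two configurations can be illustrated with a small sketch. It uses a linear 4-bit register with feedback polynomial x^4 + x^3 + 1 as a simplified stand-in; the paper itself treats nonlinear registers and derives the state mapping analytically, whereas this sketch simply searches for it by brute force. The polynomial, the bit ordering and all function names are illustrative assumptions.

```python
def fibonacci_step(state):
    """Fibonacci form: one feedback bit is computed and fed to the input bit only."""
    bit = (state ^ (state >> 1)) & 1
    return (state >> 1) | (bit << 3)


def galois_step(state):
    """Galois form: the output bit is XORed into several stages of the register."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0b1100
    return state


def output_bits(step, state, count):
    """Collect `count` output bits (the least significant bit) from `state`."""
    bits = []
    for _ in range(count):
        bits.append(state & 1)
        state = step(state)
    return bits


def period(step, state):
    """Number of steps until the register returns to `state`."""
    start, steps = state, 0
    while True:
        state = step(state)
        steps += 1
        if state == start:
            return steps


fib_seq = output_bits(fibonacci_step, 1, 15)
# Brute-force search for Galois initial states that reproduce the same output.
matching_galois_seeds = [
    seed for seed in range(1, 16)
    if output_bits(galois_step, seed, 15) == fib_seq
]
print(matching_galois_seeds)
```

    Both registers have the same period, but starting them from the same state gives output sequences that are shifted against each other, so a different initial state is needed in the Galois form.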

  • 120.
    Dubrova, Elena
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Multiple-Valued Logic Synthesis and Optimization. 2002. In: Logic Synthesis and Verification / [ed] S. Hassoun and T. Sasao, Kluwer Academic Publishers, 2002, 1, p. 89-114. Chapter in book (Refereed)
  • 121.
    Dubrova, Elena
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Synthesis of Binary Machines. 2011. In: IEEE Transactions on Information Theory, ISSN 0018-9448, E-ISSN 1557-9654, Vol. 57, no 10, p. 6890-6893. Article in journal (Refereed)
    Abstract [en]

    The problem of constructing a binary machine with the minimum number of stages generating a given binary sequence is addressed. Binary machines are a generalization of nonlinear feedback shift registers (NLFSRs) in which both connections, feedback and feedforward, are allowed and no chain connection between the register stages is required. An algorithm for constructing a shortest binary machine generating a given periodic binary sequence is presented.

  • 122.
    Dubrova, Elena
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Synthesis of parallel binary machines. 2011. In: Computer-Aided Design (ICCAD), 2011 IEEE/ACM International Conference on, 2011, p. 200-206. Conference paper (Refereed)
    Abstract [en]

    Binary machines are a generalization of Feedback Shift Registers (FSRs) in which both, feedback and feedforward, connections are allowed and no chain connection between the register stages is required. In this paper, we present an algorithm for synthesis of binary machines with the minimum number of stages for a given degree of parallelization. Our experimental results show that for sequences with high linear complexity such as complementary, Legendre, or truly random, parallel binary machines are an order of magnitude smaller than parallel FSRs generating the same sequence. The presented approach can potentially be of advantage for many applications including wireless communication, cryptography, and testing.

  • 123.
    Dubrova, Elena
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Upper Bound on the Number of Products in a Sum-of-Product Expansion of Multiple-Valued Functions. 2000. In: Multiple-Valued Logic, An International Journal, Vol. 5, p. 349-364. Article in journal (Refereed)
  • 124.
    Dubrova, Elena
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Ellervee, P.
    Muzio, J.
    Miller, M.
    TOP: An Algorithm for Three-Level Optimization of PLDs. 2000. In: Proceedings of Design, Automation and Test in Europe Conference and Exhibition, 2000. Conference paper (Refereed)
  • 125.
    Dubrova, Elena
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Farm, P.
    A Conjunctive Canonical Expansion of Multiple-Valued Functions. 2002. In: Proceedings of 32nd IEEE International Symposium on Multiple-Valued Logic, 2002. Conference paper (Refereed)
  • 126.
    Dubrova, Elena
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Liu, Ming
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Teslenko, Maxim
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Finding Attractors in Synchronous Multiple-Valued Networks Using SAT-based Bounded Model Checking. 2012. In: Journal of Multiple-Valued Logic and Soft Computing, ISSN 1542-3980, E-ISSN 1542-3999, Vol. 19, no 1-3, p. 109-131. Article in journal (Refereed)
    Abstract [en]

    Synchronous multiple-valued networks are a discrete-space discrete-time model of the gene regulatory network of living cells. In this model, cell types are represented by the cycles in the state transition graph of a network, called attractors. When the effect of a disease or a mutation on a cell is studied, attractors have to be re-computed each time a fault is injected in the model. This motivates research on algorithms for finding attractors. Existing decision diagram-based approaches have limited capacity due to the excessive memory requirements of decision diagrams. Simulation-based approaches can be applied to larger networks; however, they are incomplete. We present an algorithm for finding attractors which uses SAT-based bounded model checking. Our model checking approach exploits the deterministic nature of the network model to reduce runtime. Although the idea of applying model checking to the analysis of gene regulatory networks is not new, to the best of our knowledge, we are the first to use it for computing all attractors in a model. The efficiency of the presented algorithm is evaluated by analyzing 7 network models of real biological processes as well as 35,000 randomly generated 4-valued networks. The results show that our approach has the potential to handle models an order of magnitude larger than currently possible.
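
    As an illustration of what an attractor is, the following minimal sketch enumerates every state of a tiny synchronous network and follows it until it enters a cycle. This is plain exhaustive simulation, not the SAT-based bounded model checking of the paper, and it only scales to very small networks. The three-node update rule and all names are hypothetical.

```python
from itertools import product


def find_attractors(update, num_nodes, num_values=2):
    """Return every attractor (cycle of states) of a synchronous network.

    update     -- function mapping a state tuple to the next state tuple
    num_values -- number of values per node (2 gives a Boolean network)
    """
    attractors = set()
    for start in product(range(num_values), repeat=num_nodes):
        position = {}  # state -> index at which it was first visited
        path = []
        state = start
        while state not in position:
            position[state] = len(path)
            path.append(state)
            state = update(state)
        cycle = path[position[state]:]
        # Rotate the cycle so that equal cycles compare equal.
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


def toy_update(state):
    """Hypothetical 3-node rule: a and b swap values, c is kept only while a is 1."""
    a, b, c = state
    return (b, a, a & c)


for attractor in sorted(find_attractors(toy_update, 3)):
    print(attractor)
```

    The toy network has three fixed points and one cycle of length two; every other state eventually falls into one of them.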

  • 127.
    Dubrova, Elena
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Macchiarulo, L.
    A Comment on Graph-Based Algorithm for Boolean Function Manipulation. 2000. In: IEEE Transactions on Computers, p. 1290-1292. Article in journal (Refereed)
  • 128.
    Dubrova, Elena
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Mansouri, Shohreh
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    A BDD-Based Method for LFSR Parallelization with Application to Fast CRC Encoding. 2013. In: Journal of Multiple-Valued Logic and Soft Computing, ISSN 1542-3980, E-ISSN 1542-3999, Vol. 21, no 5, p. 561-575. Article in journal (Refereed)
    Abstract [en]

    Galois Fields of order $2^k$, $GF(2^k)$, provide a unified theoretical framework for constructing parallel devices generating $k$ output bits per clock cycle. In this paper, we use $GF(2^k)$ for constructing Linear Feedback Shift Registers (LFSRs) for the parallel encoding of Cyclic Redundancy Check (CRC) codes. CRC codes are widely used in data communication and storage for detecting burst errors. Traditional methods for the parallel encoding of CRC are based on computing the $k$th power of the connection matrix of the LFSR. We propose an alternative method based on computing the $k$th power of the transition relation of the LFSR. We use Binary Decision Diagrams (BDDs) for representing the transition relation in a partitioned form. This allows us to bound the size of BDDs by $O(n^2)$, where $n$ is the size of the LFSR. The presented algorithm is asymptotically faster than previous algorithms for LFSR parallelization.

  • 129.
    Dubrova, Elena
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Sack, H.
    Probabilistic Verification of Multiple-Valued Functions. 2000. In: Proceedings of the 30th IEEE International Symposium on Multiple-Valued Logic, 2000, p. 460-467. Conference paper (Refereed)
  • 130.
    Dubrova, Elena
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Sarif Mansouri, Shohreh
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    A BDD-based approach to constructing LFSRs for parallel CRC encoding. 2012. In: Proceedings, IEEE 42nd International Symposium on Multiple-Valued Logic. ISMVL 2012, IEEE Computer Society, 2012, p. 128-133. Conference paper (Refereed)
    Abstract [en]

    Cyclic Redundancy Check codes (CRC) are widely used in data communication and storage devices for detecting burst errors. In applications requiring high-speed data transmission, multiple bits of a CRC are computed in parallel. Traditional methods for constructing a Linear Feedback Shift Register (LFSR) generating k bits of a CRC in parallel are based on computing the kth power of the connection matrix of the LFSR. We propose an alternative method which is based on computing the kth power of the transition relation of the LFSR. We use Binary Decision Diagrams (BDDs) for representing the transition relation and we keep the transition relation partitioned. This allows us to bound the size of BDDs by $O(n^2)$, where $n$ is the size of the LFSR. Our experimental results show that the presented algorithm asymptotically improves the complexity of previous approaches.
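
    The traditional matrix-power approach mentioned in the abstract can be sketched as follows; this is not the BDD-based method the paper proposes. The sketch assumes an 8-bit register with the CRC-8 polynomial x^8 + x^2 + x + 1 and ignores the data input, so it only shows that one multiplication by the kth power of the connection matrix over GF(2) advances the register by k serial steps. All names are illustrative.

```python
N = 8        # register length (assumption for this sketch)
POLY = 0x07  # x^8 + x^2 + x + 1 without the leading term


def serial_step(state):
    """One serial shift of the LFSR with zero data input."""
    msb = (state >> (N - 1)) & 1
    state = (state << 1) & ((1 << N) - 1)
    return state ^ POLY if msb else state


def transition_matrix():
    """Connection matrix over GF(2); row i is a bit mask over the input bits."""
    columns = [serial_step(1 << j) for j in range(N)]
    rows = []
    for i in range(N):
        row = 0
        for j in range(N):
            if (columns[j] >> i) & 1:
                row |= 1 << j
        rows.append(row)
    return rows


def mat_mul(a, b):
    """Matrix product over GF(2) on row bit masks."""
    product_rows = []
    for row in a:
        acc = 0
        for j in range(N):
            if (row >> j) & 1:
                acc ^= b[j]
        product_rows.append(acc)
    return product_rows


def mat_pow(a, k):
    """kth power of a matrix over GF(2) by repeated multiplication."""
    result = [1 << i for i in range(N)]  # identity matrix
    for _ in range(k):
        result = mat_mul(result, a)
    return result


def mat_vec(a, state):
    """Apply the matrix to a state; each output bit is a parity of selected bits."""
    out = 0
    for i, row in enumerate(a):
        if bin(row & state).count("1") & 1:
            out |= 1 << i
    return out


def serial_steps(state, k):
    """Apply k serial steps one after another."""
    for _ in range(k):
        state = serial_step(state)
    return state


A4 = mat_pow(transition_matrix(), 4)
print(mat_vec(A4, 0xA5) == serial_steps(0xA5, 4))
```

    A full parallel CRC encoder would also need a second matrix for the k input bits consumed per clock cycle; that part is left out here to keep the sketch short.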

  • 131.
    Dubrova, Elena
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Sharif Mansouri, Shohreh
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    A BDD-Based Method for LFSR Parallelization with Application to Fast CRC Encoding. 2013. In: Journal of Multiple-Valued Logic and Soft Computing, ISSN 1542-3980, E-ISSN 1542-3999, Vol. 21, no 5-6, p. 561-574. Article in journal (Refereed)
    Abstract [en]

    Galois Fields of order $2^k$, $GF(2^k)$, provide a unified theoretical framework for constructing parallel devices generating $k$ output bits per clock cycle. In this paper, we use $GF(2^k)$ for constructing Linear Feedback Shift Registers (LFSRs) for the parallel encoding of Cyclic Redundancy Check (CRC) codes. CRC codes are widely used in data communication and storage for detecting burst errors. Traditional methods for the parallel encoding of CRC are based on computing the $k$th power of the connection matrix of the LFSR. We propose an alternative method based on computing the $k$th power of the transition relation of the LFSR. We use Binary Decision Diagrams (BDDs) for representing the transition relation in a partitioned form. This allows us to bound the size of BDDs by $O(n^2)$, where $n$ is the size of the LFSR. The presented algorithm is asymptotically faster than previous algorithms for LFSR parallelization.

  • 132.
    Dubrova, Elena
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Teslenko, Maxim
    Ming, Liu
    Finding Attractors in Synchronous Multiple-Valued Networks Using SAT-based Bounded Model Checking. 2010. In: 40th IEEE International Symposium on Multiple-Valued Logic, ISMVL 2010, Los Alamitos: IEEE Computer Society, 2010, p. 144-149. Conference paper (Refereed)
    Abstract [en]

    Synchronous multiple-valued networks are a discrete-space discrete-time model of the gene regulatory network of living cells. In this model, cell types are represented by the cycles in the state transition graph of a network, called attractors. When the effect of a disease or a mutation on a cell is studied, attractors have to be re-computed each time a fault is injected in the model. This motivates research on algorithms for finding attractors. Existing decision diagram-based approaches have limited capacity due to the excessive memory requirements of decision diagrams. Simulation-based approaches can be applied to larger networks; however, they are incomplete. We present an algorithm for finding attractors which uses SAT-based bounded model checking. Our model checking approach exploits the deterministic nature of the network model to reduce runtime. Although the idea of applying model checking to the analysis of gene regulatory networks is not new, to the best of our knowledge, we are the first to use it for computing all attractors in a model. The efficiency of the presented algorithm is evaluated by analyzing 7 network models of real biological processes as well as 35,000 randomly generated 4-valued networks. The results show that our approach has the potential to handle models an order of magnitude larger than currently possible.

  • 133.
    Ebrahimi, M.
    et al.
    Turku Centre for Computer Science (TUCS).
    Daneshtalab, M.
    Turku Centre for Computer Science (TUCS).
    Liljeberg, P.
    Turku Centre for Computer Science (TUCS).
    Plosila, J.
    Turku Centre for Computer Science (TUCS).
    Flich, J.
    Turku Centre for Computer Science (TUCS).
    Tenhunen, Hannu
    KTH, School of Information and Communication Technology (ICT), Electronic Systems. KTH, School of Information and Communication Technology (ICT), Centres, VinnExcellence Center for Intelligence in Paper and Packaging, iPACK.
    Path-based Partitioning Methods for 3D Networks-on-Chip with Minimal Adaptive Routing. 2012. In: IEEE Transactions on Computers (Print), ISSN 0018-9340, E-ISSN 1557-9956, Vol. 99, p. 1-16. Article in journal (Refereed)
    Abstract [en]

    Combining the benefits of 3D ICs and Networks-on-Chip (NoCs) schemes provides a significant performance gain for 3D architectures. Since multicast communication is commonly used in cache coherence protocols for CMPs and in various parallel applications, the performance in these systems can be significantly improved if multicast operations are supported at hardware level. In this paper, we present several partitioning methods for the path-based multicast approach in 3D mesh-based NoCs, each with different levels of efficiency. In addition, we develop novel analytical models for unicast and multicast traffic to explore the efficiency of each approach. In order to distribute the unicast and multicast traffic more efficiently over the network, we propose Minimal Adaptive Routing (MAR) algorithm for the presented partitioning methods. The analytical and experimental results show that an advantageous method named Recursive Partitioning (RP) outperforms the other approaches. RP recursively partitions the network until all partitions contain a comparable number of switches and the multicast traffic is equally distributed among several subsets. The simulation results reveal that the RP method can achieve performance improvement across all workloads while the performance can be further improved by utilizing MAR, 19% average and 42% maximum latency reduction, on SPLASH-2 and PARSEC benchmarks.

  • 134. Ebrahimi, M.
    et al.
    Daneshtalab, M.
    Liljeberg, P.
    Plosila, J.
    Tenhunen, Hannu
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    A High-Performance Network Interface Architecture for NoCs Using Reorder Buffer Sharing. 2010. In: 18th Euromicro Conference on Parallel, Distributed and Network-Based Processing, PDP 2010, 2010, p. 546-550. Conference paper (Refereed)
    Abstract [en]

    Increasing memory parallelism in MPSoCs to provide higher memory bandwidth is achieved by accessing multiple memories simultaneously. Inasmuch as the response transactions of concurrent memory accesses must be in-order, a reordering mechanism is required. To our knowledge the resource utilization of conventional reordering mechanisms is low. In this paper, we present a novel network interface architecture for on-chip networks to increase the resource utilization and to improve overall performance. Also, based on the proposed architecture, a hybrid network interface is presented to integrate both memory and processor in a tile. The proposed architecture exploits AXI transaction based protocol to be compatible with existing IP cores. Experimental results with synthetic test cases demonstrate that the proposed architecture outperforms the conventional architecture in terms of latency. Also, the cost of the presented architecture is evaluated with UMC 0.09μm technology.

  • 135. Ebrahimi, M.
    et al.
    Daneshtalab, M.
    Liljeberg, P.
    Plosila, J.
    Tenhunen, Hannu
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Efficient congestion-aware selection method for on-chip networks. 2011. In: 6th International Workshop on Reconfigurable Communication-Centric Systems-on-Chip, ReCoSoC 2011 - Proceedings, 2011. Conference paper (Refereed)
    Abstract [en]

    The choice of routing algorithm can have a large impact on the performance of on-chip networks. As adaptive routing algorithms may return a set of output channels, a selection method (routing policy) is employed to choose the appropriate output channel from the given set. In this paper, we present a novel on-chip network structure to detect the local and non-local congested areas. Based on the presented structure, an efficient congestion-aware selection method is proposed to choose an output channel that allows a packet to be routed through a less congested area.

  • 136. Ebrahimi, M.
    et al.
    Daneshtalab, M.
    Liljeberg, P.
    Tenhunen, Hannu
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Partitioning methods for unicast/multicast traffic in 3D NoC architecture. 2010. In: Proceedings of the 13th IEEE Symposium on Design and Diagnostics of Electronic Circuits and Systems, DDECS 2010, 2010, p. 127-132. Conference paper (Refereed)
    Abstract [en]

    As the scale of integration grows, the interconnection problem becomes one of the major design considerations of Multi Processor System on Chip (MPSoC). In recent years, many researchers have conducted studies on 3D IC designs stacking multiple layers on top of each other. In order to decrease the transmission delay of unicast/multicast messages in a network based multicore system, the network is divided into several partitions. In this paper, we first introduce a novel idea of balanced partitioning that allows the network to be partitioned effectively. Then, we propose a set of partitioning approaches each with a different level of efficiency. In addition, we present an advantageous method based on the idea of balanced partitioning to provide a high degree of parallelism with a considerable reduction of packet delay in unicast/multicast traffic. Simulations are provided to evaluate and compare the performance of proposed methods.

  • 137. Ebrahimi, M.
    et al.
    Daneshtalab, M.
    Liljeberg, P.
    Tenhunen, Hannu
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Performance Analysis of 3D NoCs Partitioning Methods. 2010. In: IEEE Annual Symposium on VLSI, ISVLSI 2010, 2010, p. 479-480. Conference paper (Refereed)
    Abstract [en]

    3D IC design improves performance and decreases power consumption by replacing long horizontal interconnects with short vertical ones. Achieving higher performance along with reducing the network latency can be obtained by utilizing an efficient communication protocol in 3D Networks-on-Chip (NoCs). In this work, several unicast/multicast partitioning methods are explained in order to find an advantageous method with low communication latency. Moreover, two factors of efficiency, unicast latency and multicast latency, are analyzed by analytical models. We also perform simulation to compare the efficiency of proposed methods. The results show that the Mixed Partitioning method outperforms other methods in terms of latency.

  • 138.
    Ebrahimi, Masoumeh
    et al.
    KTH, School of Information and Communication Technology (ICT), Industrial and Medical Electronics. University of Turku, Finland .
    Wang, J.
    Huang, L.
    Daneshtalab, Masoud
    KTH, School of Information and Communication Technology (ICT), Electronics and Embedded Systems. University of Turku, Finland .
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Rescuing healthy cores against disabled routers. 2014. Conference paper (Refereed)
    Abstract [en]

    A router may be temporarily or permanently disabled in NoCs for several reasons, such as power saving, faults, or testing. Disabling a router, however, may have a severe impact on the performance or functionality of the entire system if it results in disconnecting the core from the network. In this paper, we propose a deadlock-free routing algorithm which allows the core to stay connected to the system and continue its normal operation when its connected router is disabled. Our analysis and experiments show that the proposed technique has 100%, 93.60%, and 87.19% network availability with 100% packet delivery when 1, 2 and 3 routers are defunct or intentionally disabled. The algorithm provides adaptivity and is lightweight, requiring one and two virtual channels along the X and Y dimensions, respectively.

  • 139.
    Ejaz, Ahsen
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Costs and benefits of flexibility in spatial division Circuit Switched Networks-on-Chip. 2013. In: NoCArc '13 Proceedings of the Sixth International Workshop on Network on Chip Architectures, Association for Computing Machinery (ACM), 2013, p. 41-46. Conference paper (Refereed)
    Abstract [en]

    Although most Network-on-Chip (NoC) designs are based on Packet Switching (PS), the importance of Circuit Switching (CS) should not be underestimated. Many MPSoCs executing real-time applications require an underlying communication backbone that can relay messages from one node to another with guaranteed throughput. Compared to PS, CS can provide guaranteed throughput with lower area and power overheads. It is also highly suited for applications where nodes transfer long messages. Spatial Division Multiplexing (SDM) can allow more efficient use of available network resources by dividing them among multiple simultaneous transactions. The network developed by Vali [1] has three design variations based on the number of sub-channels, has a predictable connection setup time, and uses CS to provide guaranteed throughput once a connection is established. In this paper we use this network as a basis to study the effect of SDM-based flexibility on the performance of CS networks. A network evaluation platform has been developed to configure and evaluate networks with a maximum of 8 sub-networks, with each sub-network comprising 1, 2 or 4 sub-channels. We show that under a uniform traffic pattern with requests of uniform random bandwidth (BW) requirement, a less flexible network outperforms a network with higher flexibility due to a phenomenon we call 'stray requests'. We conclude this paper by showing that under high network traffic, the performance of our flexible networks can be as much as 113% better than HAGAR [2] and Liu's [3] network. Co

  • 140. Ellervee, Peeter
    et al.
    Kumar, Shashi
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Hemani, Ahmed
    Svantesson, Bengt
    Öberg, Johnny
    Sander, Ingo
    IRSYD - An Internal Representation for System Description. Version 0.1. 1997. Report (Other academic)
  • 141. Ellervee, Peeter
    et al.
    Kumar, Shashi
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Svantesson, Bengt
    Meincke, Thomas
    Hemani, Ahmed
    IRSYD: An Internal Representation for Heterogeneous Embedded Systems. 1998. In: Proceedings of the 16th NORCHIP Conference, 1998. Conference paper (Refereed)
  • 142.
    Ellervee, Peeter
    et al.
    KTH, School of Information and Communication Technology (ICT).
    Miranda, Miguel
    IMEC.
    Catthoor, Francky
    IMEC, Katholieke Universiteit.
    Hemani, Ahmed
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Exploiting data transfer locality in memory mapping. 1999. In: EUROMICRO Conference, 1999. Proceedings. 25th, 1999, Vol. 1, p. 14-21. Conference paper (Refereed)
    Abstract [en]

    System-level exploration of memory architectures is one of the key issues in the successful implementation of data-transfer dominated applications. Usually, one of the main design bottlenecks is the memory access bandwidth. Transformations that rearrange the layout of the data records stored in memory are very effective at improving the locality of the data transfers, but usually lead to large memory bit-wastage when not performed carefully. In this paper, a methodology which reduces memory bandwidth requirements without sacrificing storage space is proposed. The methodology exploits parallelism in the data transfers to rearrange the layout of the data records. Distributed memory organization combined with our proposed layout rearrangement methodology effectively reduces the memory bandwidth bottleneck in data-transfer dominated applications.

  • 143. Ellervee, Peeter
    et al.
    Öberg, Johnny
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Hemani, Ahmed
    Neural Network Based Estimator to Explore the Design Space at System Level. 1994. In: Proceedings of the Biennial Baltic Electronic Conference, Tallinn, 1994. Conference paper (Refereed)
  • 144.
    Eslami Kiasari, Abbas
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Performance Analysis and Design Space Exploration of On-Chip Interconnection Networks. 2013. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    The advance of semiconductor technology, which has led to more than one billion transistors on a single chip, has enabled designers to integrate dozens of IP (intellectual property) blocks together with large amounts of embedded memory. These advances, along with the fact that traditional communication architectures do not scale well, have led to significant changes in the architecture and design of integrated circuits. One solution to these problems is to implement such a complex system using an on-chip interconnection network or network-on-chip (NoC). The multiple concurrent connections of such networks mean that they have extremely high bandwidth. Regularity can lead to design modularity, providing a standard interface for easier component reuse and improved interoperability.

    The present thesis addresses the performance analysis and design space exploration of NoCs using analytical and simulation-based performance analysis approaches. First, we developed a simulator aimed at performance analysis of interconnection networks. The simulator is then used to evaluate the performance of network topologies and routing algorithms, since their choice heavily affects the performance of NoCs. Then, we surveyed popular mathematical formalisms – queueing theory, network calculus, schedulability analysis, and dataflow analysis – and how they have been applied to the analysis of on-chip communication performance in NoCs. We also addressed research problems related to modelling and design space exploration of NoCs.

    In the next step, analytical router models were developed to analyse NoC performance. In addition to providing aggregate performance metrics such as latency and throughput, our approach also provides feedback about the network characteristics at a fine level of granularity. Our approach explicates the impact that various design parameters have on the performance, thereby providing invaluable insight into NoC design. This makes it possible to use the proposed models as a powerful design and optimisation tool.

    We then used the proposed analytical models to address the design space exploration and optimisation problem. System-level frameworks to address the application mapping and to design routing algorithms for NoCs were presented. We first formulated an optimisation problem of minimizing average packet latency in the network, and then solved this problem using the simulated annealing heuristic. The proposed framework can also address other design space exploration problems such as topology selection and buffer dimensioning.

  • 145.
    Eslami Kiasari, Abbas
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Bekooij, M.
    Burns, A.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Analytical approaches for performance evaluation of networks-on-chip. 2012. In: CASES'12 - Proceedings of the 2012 ACM International Conference on Compilers, Architectures and Synthesis for Embedded Systems, Co-located with ESWEEK, ACM, 2012, p. 211-212. Conference paper (Refereed)
    Abstract [en]

    This tutorial reviews four popular mathematical formalisms - dataflow analysis, schedulability analysis, network calculus, and queueing theory - and how they have been applied to the analysis of Network-on-Chip (NoC) performance. We review the basic concepts and results of each formalism and provide examples of how they have been used in on-chip communication performance analysis. The tutorial also discusses the respective strengths and weaknesses of each formalism, their suitability for a specific purpose, and the attempts that have been made to bridge these analytical approaches. Finally, we conclude the tutorial by discussing open research issues.

  • 146.
    Eslami Kiasari, Abbas
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    A Framework for Designing Congestion-Aware Deterministic Routing. 2010. In: NoCArc '10 Proceedings of the Third International Workshop on Network on Chip Architectures, 2010, p. 45-50. Conference paper (Refereed)
    Abstract [en]

    In this paper, we present a system-level Congestion-Aware Routing (CAR) framework for designing minimal deterministic routing algorithms. CAR exploits the peculiarities of the application workload to spread the load evenly across the network. To this end, we first formulate an optimization problem of minimizing the level of congestion in the network and then use the simulated annealing heuristic to solve this problem. The proposed framework assures deadlock-free routing, even in the networks without virtual channels. Experiments with both synthetic and realistic workloads show the effectiveness of the CAR framework. Results show that maximum sustainable throughput of the network is improved by up to 205% for different applications and architectures.

  • 147.
    Eslami Kiasari, Abbas
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    A Heuristic Framework for Designing and Exploring Deterministic Routing Algorithm for NoCs. 2013. In: Algorithms in Networks-on-Chip, Springer, 2013, p. 21-39. Chapter in book (Refereed)
    Abstract [en]

    In this chapter, we present a system-level framework for designing minimal deterministic routing algorithms for Networks-on-Chip (NoCs) that are customized for a set of applications. To this end, we first formulate an optimization problem of minimizing average packet latency in the network and then use the simulated annealing heuristic to solve this problem. To estimate the average packet latency we use a queueing-based analytical model which can capture the burstiness of the traffic. The proposed framework does not require virtual channels to guarantee deadlock freedom since routes are extracted from an acyclic channel dependency graph. Experiments with both synthetic and realistic workloads show the effectiveness of the approach. Results show that maximum sustainable throughput of the network is improved for different applications and architectures.

  • 148.
    Eslami Kiasari, Abbas
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Mathematical formalisms for performance evaluation of networks-on-chip. 2013. In: ACM Computing Surveys, ISSN 0360-0300, E-ISSN 1557-7341, Vol. 45, no 3, p. 38-. Article in journal (Refereed)
    Abstract [en]

    This article reviews four popular mathematical formalisms - queueing theory, network calculus, schedulability analysis, and dataflow analysis - and how they have been applied to the analysis of on-chip communication performance in Systems-on-Chip. The article discusses the basic concepts and results of each formalism and provides examples of how they have been used in Networks-on-Chip (NoCs) performance analysis. Also, the respective strengths and weaknesses of each technique and its suitability for a specific purpose are investigated. An open research issue is a unified analytical model for a comprehensive performance evaluation of NoCs. To this end, this article reviews the attempts that have been made to bridge these formalisms.

  • 149.
    Eslami Kiasari, Abbas
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    An Analytical Latency Model for Networks-on-Chip. 2013. In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, ISSN 1063-8210, E-ISSN 1557-9999, Vol. 21, no 1, p. 113-123. Article in journal (Refereed)
    Abstract [en]

    We propose an analytical model based on queueing theory for delay analysis in a wormhole-switched network-on-chip (NoC). The proposed model takes as input an application communication graph, a topology graph, a mapping vector, and a routing matrix, and estimates average packet latency and router blocking time. It works for arbitrary network topology with deterministic routing under arbitrary traffic patterns. This model can estimate per-flow average latency accurately and quickly, thus enabling fast design space exploration of various design parameters in NoC designs. Experimental results show that the proposed analytical model can predict the average packet latency more than four orders of magnitude faster than an accurate simulation, while the computation error is less than 10% in non-saturated networks for different system-on-chip platforms.

  • 150.
    Ezzeddine, Hussein
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Öberg, Johnny
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Robino, Francesco
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Validation of Pipelined Double-precision Floating Point operations in a multi-core environment implemented on FPGA using the ForSyDe/NoC system generator tool suite. 2015. In: NORCHIP 2014 - 32nd NORCHIP Conference: The Nordic Microelectronics Event, 2015. Conference paper (Refereed)
    Abstract [en]

    Testing HW IP Blocks in multi-core environments is difficult. This paper presents a case study where a SINE/COSINE implementation using Pipelined Double-precision operations is implemented in one node, and results are sent through the NoC to a target node for inspection. The purpose of the experiments is two-fold: a) to study how debugging in a multi-core environment can be done and b) to examine why the original SINE/COSINE implementation is generating wrong results. During the experiments, several test methods are applied to validate the implementations until the Floating Point implementation is generating correct values. After eliminating all faults in the operations, the SINE/COSINE function still generates some residual algorithmic errors, coming from the way the function was implemented. However, the experiments show that these errors can be eliminated with the help of some simple trigonometric rules.
