  • 1.
    Adam, Constantin
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Mikroelektronik och Informationsteknik, IMIT.
    Stadler, Rolf
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Mikroelektronik och Informationsteknik, IMIT.
    A Middleware Design for Large-scale Clusters offering Multiple Services. 2006. In: IEEE Transactions on Network and Service Management, ISSN 1932-4537, E-ISSN 1932-4537, Vol. 3, no. 1, pp. 1-12. Journal article (Refereed)
    Abstract [en]

    We present a decentralized design that dynamically allocates resources to multiple services inside a global server cluster. The design supports QoS objectives (maximum response time and maximum loss rate) for each service. A system administrator can modify policies that assign relative importance to services and, in this way, control the resource allocation process. Distinctive features of our design are the use of an epidemic protocol to disseminate state and control information, as well as the decentralized evaluation of utility functions to control resource partitioning among services. Simulation results show that the system operates both effectively and efficiently; it meets the QoS objectives and dynamically adapts to load changes and to failures. In case of overload, the service quality degrades gracefully, controlled by the cluster policies.

  • 2.
    Adam, Constantin
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Mikroelektronik och Informationsteknik, IMIT.
    Stadler, Rolf
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Mikroelektronik och Informationsteknik, IMIT.
    Adaptable Server Clusters with QoS Objectives. 2005. In: Integrated Network Management IX - MANAGING NEW NETWORKED WORLDS / [ed] Clemm A, Festor O, Pras A, New York: IEEE, 2005, pp. 149-163. Conference paper (Refereed)
    Abstract [en]

    We present a decentralized design for a server cluster that supports a single service with response time guarantees. Three distributed mechanisms represent the key elements of our design. Topology construction maintains a dynamic overlay of cluster nodes. Request routing directs service requests towards available servers. Membership control allocates/releases servers to/from the cluster, in response to changes in the external load. We advocate a decentralized approach, because it is scalable, fault-tolerant, and has a lower configuration complexity than a centralized solution. We demonstrate through simulations that our system operates efficiently by comparing it to an ideal centralized system. In addition, we show that our system rapidly adapts to changing load. We found that the interaction of the various mechanisms in the system leads to desirable global properties. More precisely, for a fixed connectivity c (i.e., the number of neighbors of a node in the overlay), the average experienced delay in the cluster is independent of the external load. In addition, increasing c increases the average delay but decreases the system size for a given load. Consequently, the cluster administrator can use c as a management parameter that permits control of the tradeoff between a small system size and a small experienced delay for the service.

  • 3.
    Adam, Constantin
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Mikroelektronik och Informationsteknik, IMIT.
    Stadler, Rolf
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Mikroelektronik och Informationsteknik, IMIT.
    Externally Controllable, Self-Organizing Server Clusters. 2005. In: Designing a Scalable, Self-organizing Middleware for Server Clusters (NGNM05): in the scope of Networking 2005, 2005, pp. 1-12. Chapter in book, part of anthology (Other academic)
  • 4.
    Adam, Constantin
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Implementation and evaluation of a middleware for self-organizing decentralized web services. 2006. In: Integrated Network Management IX: MANAGING NEW NETWORKED WORLDS, 2006, Vol. 3996, pp. 1-14. Conference paper (Refereed)
    Abstract [en]

    We present the implementation of Chameleon, a peer-to-peer middleware for self-organizing web services, and we provide evaluation results from a test bed. The novel aspect of Chameleon is that key functions, including resource allocation, are decentralized, which facilitates scalability and robustness of the overall system. Chameleon is implemented in Java on the Tomcat web server environment. The implementation is non-intrusive in the sense that it does not require code modifications in Tomcat or in the underlying operating system. We evaluate the system by running the TPC-W benchmark. We show that the middleware dynamically and effectively reconfigures in response to changes in load patterns and server failures, while enforcing operating policies, namely, QoS objectives and service differentiation under overload.

  • 5.
    Adam, Constantin
    KTH, Tidigare Institutioner, Mikroelektronik och informationsteknik, IMIT.
    Stadler, Rolf
    KTH, Tidigare Institutioner, Mikroelektronik och informationsteknik, IMIT.
    Patterns for Routing and Self-Stabilization. 2004. In: NOMS 2004: IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM - MANAGING NEXT GENERATION CONVERGENCE NETWORKS AND SERVICES, New York: IEEE, 2004, pp. 61-74. Conference paper (Refereed)
    Abstract [en]

    This paper contributes towards engineering self-stabilizing networks and services. We propose the use of navigation patterns, which define how information for state updates is disseminated in the system, as fundamental building blocks for self-stabilizing systems. We present two navigation patterns for self-stabilization: the progressive wave pattern and the stationary wave pattern. The progressive wave pattern defines the update dissemination in Internet routing systems running the DUAL and OSPF protocols. Similarly, the stationary wave pattern defines the interactions of peer nodes in structured peer-to-peer systems, including Chord, Pastry, Tapestry, and CAN. It turns out that both patterns are related. They both disseminate information in the form of waves, i.e., sets of messages that originate from single events. Patterns can be instrumented to obtain wave statistics, which enables monitoring the process of self-stabilization in a system. We focus on Internet routing and peer-to-peer systems in this work, since we believe that studying these (existing) systems can lead to engineering principles for self-stabilizing systems in various application areas.

  • 6.
    Adam, Constantin
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Service middleware for self-managing large-scale systems. 2007. In: IEEE Transactions on Network and Service Management, ISSN 1932-4537, E-ISSN 1932-4537, Vol. 4, no. 3, pp. 50-64. Journal article (Refereed)
    Abstract [en]

    Resource management poses particular challenges in large-scale systems, such as server clusters that simultaneously process requests from a large number of clients. A resource management scheme for such systems must scale both in the number of cluster nodes and the number of applications the cluster supports. Current solutions do not exhibit both of these properties at the same time. Many are centralized, which limits their scalability in terms of the number of nodes, or they are decentralized but rely on replicated directories, which also reduces their ability to scale. In this paper, we propose novel solutions to request routing and application placement, two key mechanisms in a scalable resource management scheme. Our solution to request routing is based on selective update propagation, which ensures that the control load on a cluster node is independent of the system size. Application placement is approached in a decentralized manner, by using a distributed algorithm that maximizes resource utilization and allows for service differentiation under overload. The paper demonstrates how the above solutions can be integrated into an overall design for a peer-to-peer management middleware that exhibits properties of self-organization. Through complexity analysis and simulation, we show to what extent the system design is scalable. We have built a prototype using accepted technologies and have evaluated it using a standard benchmark. The testbed measurements show that the implementation, within the parameter range tested, operates efficiently, quickly adapts to a changing environment and allows for effective service differentiation by a system administrator.

  • 7.
    Adam, Constantin
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Tang, Chunqiang
    Steinder, Malgorzata
    Spreitzer, Michael
    A service middleware that scales in system size and applications. 2007. In: 2007 10TH IFIP/IEEE INTERNATIONAL SYMPOSIUM ON INTEGRATED NETWORK MANAGEMENT (IM 2009): VOLS 1 AND 2, NEW YORK: IEEE, 2007, pp. 70-79. Conference paper (Refereed)
    Abstract [en]

    We present a peer-to-peer service management middleware that dynamically allocates system resources to a large set of applications. The system achieves scalability in number of nodes (1000s or more) through three decentralized mechanisms that run on different time scales. First, overlay construction interconnects all nodes in the system for exchanging control and state information. Second, request routing directs requests to nodes that offer the corresponding applications. Third, application placement controls the set of offered applications on each node, in order to achieve efficient operation and service differentiation. The design supports a large number of applications (100s or more) through selective propagation of configuration information needed for request routing. The control load on a node increases linearly with the number of applications in the system. Service differentiation is achieved through assigning a utility to each application which influences the application placement process. Simulation studies show that the system operates efficiently for different sizes, adapts fast to load changes and failures and effectively differentiates between different applications under overload.

  • 8. Ahmed, J.
    Johnsson, A.
    Moradi, F.
    Pasquini, R.
    Flinta, C.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Online approach to performance fault localization for cloud and datacenter services. 2017. In: Proceedings of the IM 2017 - 2017 IFIP/IEEE International Symposium on Integrated Network and Service Management, Institute of Electrical and Electronics Engineers Inc., 2017, pp. 873-874. Conference paper (Refereed)
    Abstract [en]

    Automated detection and diagnosis of performance faults in cloud and datacenter environments is crucial for maintaining smooth operation of services and minimizing downtime. We demonstrate an effective machine learning approach based on detecting metric correlation stability violations (CSV) for automated localization of performance faults for datacenter services running under dynamic load conditions. © 2017 IFIP.

  • 9. Ahmed, J.
    Johnsson, A.
    Yanggratoke, Rerngvit
    KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre. KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Ardelius, J.
    Flinta, C.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Predicting SLA conformance for cluster-based services using distributed analytics. 2016. In: Proceedings of the NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium, IEEE conference proceedings, 2016, pp. 848-852. Conference paper (Refereed)
    Abstract [en]

    Service assurance for the telecom cloud is a challenging task and is continuously being addressed by academics and industry. One promising approach is to utilize machine learning to predict service quality in order to take early mitigation actions. In previous work we have shown how to predict service-level metrics, such as frame rate for a video application on the client side, from operational data gathered at the server side. This gives the service provider early indications on whether the platform can support the current load demand. This paper extends previous work by addressing scalability issues for cluster-based services. Operational data being generated in large volumes, from several sources, and at high velocity puts strain on computational and communication resources. We propose and evaluate a distributed machine learning system based on the Winnow algorithm to tackle scalability issues, and then compare the new distributed solution with the previously proposed centralized solution. We show that network overhead and computational execution time are substantially reduced while maintaining high prediction accuracy, making it possible to achieve real-time service quality predictions in large systems.

  • 10. Ahmed, J.
    Josefsson, T.
    Johnsson, A.
    Flinta, C.
    Moradi, F.
    Pasquini, R.
    Stadler, Rolf
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Nätverk och systemteknik. KTH, Skolan för elektroteknik och datavetenskap (EECS), Centra, ACCESS Linnaeus Centre.
    Automated diagnostic of virtualized service performance degradation. 2018. In: IEEE/IFIP Network Operations and Management Symposium: Cognitive Management in a Cyber World, NOMS 2018, Institute of Electrical and Electronics Engineers Inc., 2018, pp. 1-9. Conference paper (Refereed)
    Abstract [en]

    Service assurance for cloud applications is a challenging task and is an active area of research for academia and industry. One promising approach is to utilize machine learning for service quality prediction and fault detection so that suitable mitigation actions can be executed. In our previous work, we have shown how to predict service-level metrics in real-time just from operational data gathered at the server side. This gives the service provider early indications on whether the platform can support the current load demand. This paper provides the logical next step where we extend our work by proposing an automated detection and diagnostic capability for the performance faults manifesting themselves in cloud and datacenter environments. This is a crucial task to maintain the smooth operation of running services and minimize downtime. We demonstrate the effectiveness of our approach, which exploits the interpretative capabilities of Self-Organizing Maps (SOMs) to automatically detect and localize different performance faults for cloud services. © 2018 IEEE.

  • 11. Baliosian, J.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Decentralized configuration of neighboring cells for radio access networks. 2007. In: 2007 IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks, WOWMOM, IEEE, 2007, pp. 4351740-. Conference paper (Refereed)
    Abstract [en]

    In order to execute a handover process in a Radio Access Network, each cell has a configured list of neighbors to which such handovers are made. Rapid re-configuration of the neighborhood list in response to network failures and other events is currently not possible. To address this problem, this paper suggests an autonomic approach for dynamically configuring neighboring cell lists and introduces a decentralized, three-layered framework. As a key element of this framework, a novel probabilistic protocol that detects and continuously tracks the coverage overlaps among cells is presented and evaluated. The protocol, called DOC, maintains a distributed graph of overlapping cells. By using Bloom filters and aggregation techniques, it exhibits a low traffic and computational overhead. A first series of simulation studies suggests that DOC is scalable with respect to network size and the number of terminals.

  • 12. Baliosian, Javier
    Matusikova, Katarina
    Quinn, Karl
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Policy-based self-healing for radio access networks. 2008. In: 2008 IEEE NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM, IEEE, 2008, pp. 1007-1010. Conference paper (Refereed)
    Abstract [en]

    Various centralized, distributed or cooperative management systems have been proposed to address the demands of wireless telecommunication networks. However, considering the size, complexity and heterogeneity that those networks will have in the future, current solutions either do not scale properly, have no support for automation, or lack the flexibility and simple control that operators will need for managing future networks in a cost-effective way. To address this problem, we designed Omega, a distributed and policy-based network management system that uses rich knowledge-modeling techniques to develop self-configuration capabilities. Omega also implements a novel conflict-resolution method that uses high-level goals and machine learning techniques to optimize its policy-based decisions. Using simulations, in this paper we show how Omega reduces the impact of a node crash on the overall availability of a radio access network by optimizing the lists of neighboring cells of the nodes in the vicinity.

  • 13.
    Baliosian, Javier
    Ericsson Ireland Research Centre, Athlone, Ireland.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Method of Discovering Overlapping Cells. 2007. Patent (Other (popular science, discussion, etc.))
  • 14. Brunner, M
    Galis, A
    Cheng, L
    Colas, J A
    Ahlgren, B
    Gunnar, A
    Abrahamsson, H
    Szabo, R
    Csaba, S
    Nielsen, J
    Prieto, Alberto Gonzalez
    KTH, Tidigare Institutioner, Mikroelektronik och informationsteknik, IMIT.
    Stadler, Rolf
    KTH, Tidigare Institutioner, Mikroelektronik och informationsteknik, IMIT.
    Molnar, G
    Ambient networks management challenges and approaches. 2004. In: MOBILITY AWARE TECHNOLOGIES AND APPLICATIONS, PROCEEDINGS / [ed] Karmouch, A; Korba, L; Madeira, ERM, BERLIN: SPRINGER, 2004, Vol. 3284, pp. 196-216. Conference paper (Refereed)
    Abstract [en]

    System management addresses the provision of functions required for controlling, planning, allocating, monitoring, and deploying the resources of a network and of its services in order to optimize its efficiency and productivity and to safeguard its operation. It is also an enabler for the creation and sustenance of new business models and value chains, reflecting the different roles the service providers and users of a network can assume. Ambient Network represents a new networking approach and it aims to enable the cooperation of heterogeneous networks, on demand and transparently, to the potential users, without the need for pre-configuration or offline negotiation between network operators. To achieve these goals, ambient network management systems have to become dynamic, adaptive, autonomic and responsive to the network and its ambience. This paper discusses relationships between the concepts of autonomous and self-manageability and those of ambient networking, and the challenges and benefits that arise from their employment.

  • 15. Brunner, M
    Galis, A
    Cheng, L
    Colas, J A
    Ahlgren, B
    Gunnar, A
    Abrahamsson, H
    Szabo, R
    Csaba, S
    Nielsen, J
    Schuetz, S
    Gonzalez Prieto, Alberto
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Molnar, G
    Towards Ambient Networks Management. 2005. In: MOBILITY AWARE TECHNOLOGIES AND APPLICATIONS, PROCEEDINGS, 2005, Vol. 3744, pp. 215-229. Conference paper (Refereed)
    Abstract [en]

    Ambient Networks (AN) are under development and they are based on novel networking concepts and systems that will enable a wide range of user and business communication scenarios beyond today's fixed, 3rd generation mobile and IP standards. Central to this project is the concept of Ambient Control Space (ACS) and the Domain Manager control function, which manages the underlying data transfer capabilities and presents a set of interfaces towards the supported services and applications. Network Management Systems of Ambient Networks must work in an environment where heterogeneous networks compose and cooperate, on demand and transparently, without the need for manual (pre or re)-configuration or offline negotiations between network operators. To achieve these goals, ambient network management systems must become dynamic, distributed, self-managing and responsive to the network and its ambience. This paper describes the different management research challenges and four complementary solution approaches (i.e. Pattern-based Management, Peer-to-Peer Management, (Un)PnP Management, Traffic Engineering Management Application Approaches) that enable efficient management of ambient networks, and the relationships between them, and presents the main results achieved so far.

  • 16.
    Burgess, Mark
    Oslo Univ Coll, Oslo, Norway..
    Disney, Matthew
    Oslo Univ Coll, Oslo, Norway..
    Stadler, Rolf
    KTH.
    Network patterns in cfengine and scalable data aggregation. 2007. In: USENIX ASSOCIATION PROCEEDING OF THE 21ST LARGE INSTALLATION SYSTEMS ADMINISTRATION CONFERENCE, USENIX ASSOC, 2007, pp. 275-+. Conference paper (Refereed)
    Abstract [en]

    Network patterns are based on generic algorithms that execute on tree-based overlays. A set of such patterns has been developed at KTH to support distributed monitoring in networks with non-trivial topologies. We consider the use of this approach in logical peer networks in cfengine as a way of scaling aggregation of data to large organizations. Use of 'deep' network structures can lead to temporal anomalies. We show how to minimize temporal fragmentation during data aggregation by using time offsets and what effect these choices might have on power consumption. We offer proof of concept for this technology to initiate either multicast or inverse multicast pulses through sensor networks.

  • 17. Clemm, A.
    Granville, L. Z.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Managing virtualization of networks and services. 2015. In: Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349, Vol. 4785. Journal article (Refereed)
  • 18. Clemm, A.
    Granville, L. Z.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Shaping the network management Research agenda-report on DSOM 2007. 2008. In: Journal of Network and Systems Management, ISSN 1064-7570, Vol. 16, no. 2, pp. 223-225. Journal article (Refereed)
    Abstract [en]

    The 18th IFIP/IEEE International Workshop on Distributed Systems: Operations and Management (DSOM 2007) was held at San José, California, USA, from October 29-31, 2007. The aim of DSOM workshops is to bring together researchers from industry and academia in the areas of network, systems, and service management, in order to discuss recent advances and foster growth. The workshops have a single-track program in order to enable intense interaction among participants. DSOM 2007 continued its tradition of giving a platform to papers that address general topics related to the management of distributed systems. It included sessions on decentralized and peer-to-peer management, fault detection and diagnosis, service accounting and auditing, problem detection and mitigation, and web services and management. DSOM 2008 will be held between September 22-26, 2008, at Samos Island, Greece. The theme will be Managing Large-scale Service Deployment, which takes up a key aspect of current research in network management.

  • 19.
    Dam, Mads
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Mikroelektronik och Informationsteknik, IMIT.
    Stadler, Rolf
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Mikroelektronik och Informationsteknik, IMIT.
    A Generic Protocol for Network State Aggregation. 2005. Conference paper (Refereed)
    Abstract [en]

    Aggregation functions, which compute global parameters, such as the sum, minimum or average of local device variables, are needed for many network monitoring and management tasks. As networks grow larger and become more dynamic, it is crucial to compute these functions in a scalable and robust manner. To this end, we have developed GAP (Generic Aggregation Protocol), a novel protocol that computes aggregates of device variables for network management purposes. GAP supports continuous estimation of aggregates in a network where local state variables and the network graph may change. Aggregates are computed in a decentralized way using an aggregation tree. We have performed a functional evaluation of GAP in a simulation environment and have identified configuration choices that potentially allow us to control the performance characteristics of the protocol.

  • 20.
    Dan, Jurca
    NTT DOCOMO Eurolabs in Munich, Germany.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES).
    H-GAP: Estimating Histograms of Local Variables with Accuracy Objectives for Distributed Real-Time Monitoring. 2010. In: IEEE Transactions on Network and Service Management, ISSN 1932-4537, E-ISSN 1932-4537, Vol. 7, no. 2, pp. 83-95. Journal article (Refereed)
    Abstract [en]

    We present H-GAP, a protocol for continuous monitoring, which provides a management station with the value distribution of local variables across the network. The protocol estimates the histogram of local state variables for a given accuracy and with minimal overhead. H-GAP is decentralized and asynchronous to achieve robustness and scalability, and it executes on an overlay interconnecting management processes in network devices. On this overlay, the protocol maintains a spanning tree and updates the histogram through incremental aggregation. The protocol is tunable in the sense that it allows controlling, at runtime, the trade-off between protocol overhead and an accuracy objective. This functionality is realized through dynamic configuration of local filters that control the flow of updates towards the management station. The paper includes an analysis of the problem of histogram aggregation over aggregation trees, a formulation of the global optimization problem, and a distributed solution containing heuristic, tree-based algorithms. Using SUM as an example, we show how general aggregation functions over local variables can be efficiently computed with H-GAP. We evaluate our protocol through simulation using real traces. The results demonstrate the controllability of H-GAP in a selection of scenarios and its efficiency in large-scale networks.

  • 21.
    Fetahi, Wuhib
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Dam, Mads
    KTH, Skolan för datavetenskap och kommunikation (CSC), Teoretisk datalogi, TCS.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Alexander, Clemm
    Cisco Systems, San Jose, CA USA.
    Robust Monitoring of Network-wide Aggregates through Gossiping. 2009. In: IEEE Transactions on Network and Service Management, ISSN 1932-4537, E-ISSN 1932-4537, Vol. 6, no. 2, pp. 95-109. Journal article (Refereed)
    Abstract [en]

    We investigate the use of gossip protocols for continuous monitoring of network-wide aggregates under crash failures. Aggregates are computed from local management variables using functions such as SUM, MAX, or AVERAGE. For this type of aggregation, crash failures offer a particular challenge due to the problem of mass loss, namely, how to correctly account for contributions from nodes that have failed. In this paper we give a partial solution. We present G-GAP, a gossip protocol for continuous monitoring of aggregates, which is robust against failures that are discontiguous in the sense that neighboring nodes do not fail within a short period of each other. We give formal proofs of correctness and convergence, and we evaluate the protocol through simulation using real traces. The simulation results suggest that the design goals for this protocol have been met. For instance, the tradeoff between estimation accuracy and protocol overhead can be controlled, and a high estimation accuracy (below some 5% error in our measurements) is achieved by the protocol, even for large networks and frequent node failures. Further, we perform a comparative assessment of G-GAP against a tree-based aggregation protocol using simulation. Surprisingly, we find that the tree-based aggregation protocol consistently outperforms the gossip protocol for comparative overhead, both in terms of accuracy and robustness.

  • 22. Flinta, C.
    Johnsson, A.
    Ahmed, J.
    Moradi, F.
    Pasquini, R.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Real-time resource prediction engine for cloud management. 2017. In: Proceedings of the IM 2017 - 2017 IFIP/IEEE International Symposium on Integrated Network and Service Management, Institute of Electrical and Electronics Engineers Inc., 2017, pp. 877-878. Conference paper (Refereed)
    Abstract [en]

    Predicting resource requirements for cloud services is critical for dimensioning, anomaly detection and service assurance. We demonstrate a system for real-time estimation of the needed amount of infrastructure resources, such as CPU and memory, for a given service. Statistical learning methods on server statistics and load parameters of the service are used for learning a resource prediction model. The model can be used as a guideline for service deployment and for real-time identification of resource bottlenecks. © 2017 IFIP.

  • 23.
    Gonzales Prieto, Alberto
    et al.
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Mikroelektronik och Informationsteknik, IMIT.
    Stadler, Rolf
    KTH, Skolan för informations- och kommunikationsteknik (ICT), Mikroelektronik och Informationsteknik, IMIT.
    Design and Implementation of Performance Policies for SMS Systems. 2005. In: AMBIENT NETWORKS, Berlin: Springer Verlag, 2005, pp. 169-180. Conference paper (Refereed)
    Abstract [en]

    We present a design for policy-based performance management of SMS Systems. The design takes as input the operator's performance goals, which are expressed as policies that can be adjusted at run-time. In our specific design, an SMS administrator can specify the maximum delay for a message and the maximum percentage of messages that can be postponed during periods of congestion. The system attempts to maximize the overall throughput while adhering to the performance policies. It does so by periodically solving a linear optimization problem that takes as input the policies and traffic statistics and computes a new configuration. We show that the computational cost for solving this problem is low, even for large system configurations. We have evaluated the design through extensive simulations in various scenarios. It has proved effective in achieving the administrator's performance goals and fast in adapting to changing network conditions. A prototype has been developed on a commercial SMS platform, which proves the validity of our design.

  • 24.
    Gonzales Prieto, Alberto
    et al.
    KTH, Skolan för elektro- och systemteknik (EES).
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES).
    Distributed real-time monitoring with accuracy objectives. 2006. In: NETWORKING 2006: NETWORKING TECHNOLOGIES, SERVICES, AND PROTOCOLS; PERFORMANCE OF COMPUTER AND COMMUNICATION NETWORKS; MOBILE AND WIRELESS COMMUNICATIONS SYSTEMS / [ed] Boavida F, Plagemann T, Stiller B, Westphal C, Monteiro E, Berlin: Springer Verlag, 2006, pp. 1246-1251. Conference paper (Refereed)
    Abstract [en]

    We introduce A-GAP, a protocol for continuous monitoring of network state variables with configurable accuracy. Network state variables are computed from device counters using aggregation functions, such as SUM, AVERAGE and MAX. In A-GAP, the accuracy is expressed in terms of the average error and is controlled by dynamically configuring filters in the management nodes. The protocol follows the push approach to monitoring and uses the concept of incremental aggregation on a self-stabilizing spanning tree. A-GAP is decentralized and asynchronous to achieve robustness and scalability. We provide some results from evaluating the protocol for an ISP topology (Abovenet) in several scenarios through simulation. The results show that we can effectively control the fundamental trade-off between accuracy and overhead. The protocol overhead can be reduced significantly by allowing only small error objectives.

  • 25.
    Gonzales Prieto, Alberto
    et al.
    KTH, Skolan för elektro- och systemteknik (EES).
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES).
    Scalable policy distribution for ambient networks. 2005. In: Proceedings of the 14th IST Mobile and Wireless Communication Summit, 2005. Conference paper (Refereed)
    Abstract [en]

    The characteristics of policy-based management make it an interesting candidate for managing Ambient Networks, which are characterized by being highly dynamic and heterogeneous. However, current policy-based approaches are not scalable, which is a must for such dynamic scenarios. A key aspect for developing scalable systems is policy distribution, the mechanism that provides the right policies at the right locations in the network when they are needed. In this paper, we present a scalable framework for policy distribution for Ambient Networks. The framework is based on aggregating the addresses of the policies and applying multipoint communication techniques. The aggregation is based on grouping the managed elements by the role they play in the network and distributing policies that apply to all the elements in a group. We show the validity of the framework by applying it to a case study.

  • 26.
    Gonzalez Prieto, Alberto
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Dudkowski, Dominique
    Meirosu, Catalin
    Mingardi, Chiara
    Nunzi, Giorgio
    Brunner, Marcus
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Decentralized In-Network Management for the Future Internet. 2009. In: 2009 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION WORKSHOPS, NEW YORK: IEEE, 2009, pp. 803-807. Conference paper (Refereed)
    Abstract [en]

    In-network management (INM) is a new paradigm for the management of the future Internet that is based on the principles of decentralization and self-organization. Its goal is to overcome the limitations of traditional network management and to achieve scalable and robust management systems with low complexity for large-scale, dynamic network environments. In this paper, we describe a framework for INM that provides a systematic approach to the embedding of management algorithms within the elements of a communication network. In addition, we demonstrate the benefits of decentralized management in the context of two key management functions, namely real-time monitoring and event handling.

  • 27.
    Gonzalez Prieto, Alberto
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Adaptive distributed monitoring with accuracy objectives. 2006. In: Proceedings of the 2006 SIGCOMM Workshop on Internet Network Management, INM'06, 2006, Vol. 2006, pp. 65-70. Conference paper (Refereed)
    Abstract [en]

    We present A-GAP, a novel protocol for continuous monitoring of network state variables, which aims at achieving a given monitoring accuracy with minimal overhead. Network state variables are computed from device counters using aggregation functions, such as SUM, AVERAGE and MAX. The accuracy objective is expressed as the average estimation error. A-GAP is decentralized and asynchronous to achieve robustness and scalability. It executes on an overlay that interconnects management processes on the devices. On this overlay, the protocol maintains a spanning tree and updates the network state variables through incremental aggregation. It dynamically configures local filters that control whether an update is sent towards the root of the tree. It reduces the overhead by attempting to minimize the maximum processing load over all management processes. We evaluate A-GAP through simulation using an ISP topology and real traces. The results show that we can effectively control the trade-off between accuracy and protocol overhead, that the overhead can be reduced significantly by allowing small errors, and that an accurate estimation of the error distribution can be provided in real-time.

  • 28.
    Gonzalez Prieto, Alberto
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsteori.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsteori.
    Adaptive Performance Management for SMS Systems. 2009. In: Journal of Network and Systems Management, ISSN 1064-7570, E-ISSN 1573-7705, Vol. 17, no. 4, pp. 397-421. Journal article (Refereed)
    Abstract [en]

    We present a design for performance management of SMS systems. The design takes as input the administrator's performance objectives, which can be adjusted at run-time. Based on these objectives, the design takes the necessary actions to achieve them and it dynamically adapts to changing networking conditions. It does so by periodically solving a linear optimization problem that computes a new configuration for the SMS system. We have evaluated the design through extensive simulations in various scenarios using traces from a production SMS system. It has proved effective in achieving the administrator's performance objectives, and efficient in terms of computational cost. Our experiments also show that the design is adaptive, i.e., it effectively adapts the system's configuration to changes in the networking conditions, in order to continuously meet the performance objectives. Finally, the feasibility of our design is proved through the development of a prototype on a commercial SMS platform.

  • 29.
    Gonzalez Prieto, Alberto
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    A-GAP: An Adaptive Protocol for Continuous Network Monitoring with Accuracy Objectives. 2007. In: IEEE Transactions on Network and Service Management, ISSN 1932-4537, Vol. 4, no. 1, pp. 2-12. Journal article (Refereed)
    Abstract [en]

    We present A-GAP, a novel protocol for continuous monitoring of network state variables, which aims at achieving a given monitoring accuracy with minimal overhead. Network state variables are computed from device counters using aggregation functions, such as SUM, AVERAGE and MAX. The accuracy objective is expressed as the average estimation error. A-GAP is decentralized and asynchronous to achieve robustness and scalability. It executes on an overlay that interconnects management processes on the devices. On this overlay, the protocol maintains a spanning tree and updates the network state variables through incremental aggregation. Based on a stochastic model, it dynamically configures local filters that control whether an update is sent towards the root of the tree. We evaluate A-GAP through simulation using real traces and two different types of topologies of up to 650 nodes. The results show that we can effectively control the trade-off between accuracy and protocol overhead, and that the overhead can be reduced by almost two orders of magnitude when allowing for small errors. The protocol quickly adapts to a node failure and exhibits short spikes in the estimation error. Lastly, it can provide an accurate estimate of the error distribution in real-time.

  • 30.
    Gonzalez Prieto, Alberto
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Controlling Performance Trade-offs in Adaptive Network Monitoring. 2009. In: 11th IFIP/IEEE International Symposium on Integrated Network Management (IM 2009), IEEE, 2009, pp. 359-366. Conference paper (Refereed)
    Abstract [en]

    A key requirement for autonomic (i.e., self-*) management systems is a short adaptation time to changes in the networking conditions. In this paper, we show that the adaptation time of a distributed monitoring protocol can be controlled. We show this for A-GAP, a protocol for continuous monitoring of global metrics with controllable accuracy. We demonstrate through simulations that, for the case of A-GAP, the choice of the topology of the aggregation tree controls the tradeoff between adaptation time and protocol overhead in steady-state. Generally, allowing a larger adaptation time permits reducing the protocol overhead. Our results suggest that the adaptation time primarily depends on the height of the aggregation tree and that the protocol overhead is strongly influenced by the number of internal nodes. We outline how A-GAP can be extended to dynamically self-configure and to continuously adapt its configuration to changing conditions, in order to meet a set of performance objectives, including adaptation time, protocol overhead, and estimation accuracy.

  • 31.
    Gonzalez Prieto, Alberto
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Monitoring Flow Aggregates with Controllable Accuracy. 2007. In: Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349, Vol. 4787, pp. 64-75. Journal article (Refereed)
    Abstract [en]

    In this paper, we show the feasibility of real-time flow monitoring with controllable accuracy in today’s IP networks. Our approach is based on Netflow and A-GAP. A-GAP is a protocol for continuous monitoring of network state variables, which are computed from device metrics using aggregation functions, such as SUM, AVERAGE and MAX. A-GAP is designed to achieve a given monitoring accuracy with minimal overhead. A-GAP is decentralized and asynchronous to achieve robustness and scalability. The protocol incrementally computes aggregation functions inside the network and, based on a stochastic model, it dynamically configures local filters that control the overhead and accuracy. We evaluate a prototype in a testbed of 16 commercial routers and provide measurements from a scenario where the protocol continuously estimates the total number of FTP flows in the network. Local flow metrics are read out from Netflow buffers and aggregated in real time. We evaluate the prototype for the following criteria. First, the ability to effectively control the trade-off between monitoring accuracy and processing overhead; second, the ability to accurately predict the distribution of the estimation error; third, the impact of a sudden change in topology on the performance of the protocol. The testbed measurements are consistent with simulation studies we performed for different topologies and network sizes, which proves the feasibility of the protocol design, and, more generally, the feasibility of effective and efficient real-time flow monitoring in large network environments.

  • 32.
    Gonzalez Prieto, Alberto
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Real-time Network Monitoring Supporting Percentile Error Objectives. 2007. In: 14th HP Software University Association (HP-SUA) Workshop, 8-11 July 2007, Munich, Germany, 2007. Conference paper (Refereed)
    Abstract [en]

    We report on the versatility of A-GAP for supporting different types of accuracy objectives. Previously, we considered accuracy objectives expressed in terms of the average error. In this paper, we focus on percentile error objectives. A-GAP is a protocol for continuous monitoring of network state variables. Network state variables are computed from device counters using aggregation functions, such as SUM, AVERAGE and MAX. A-GAP is designed to achieve a given monitoring accuracy with minimal overhead. A-GAP is decentralized and asynchronous to achieve robustness and scalability. It executes on an overlay that interconnects management processes on the devices. On this overlay, the protocol maintains a spanning tree and updates the network state variables through incremental aggregation. Based on a stochastic model, it dynamically configures local filters that control whether an update is sent towards the root of the tree. We evaluate A-GAP through simulation using real traces for an ISP topology (Abovenet). The results prove the versatility of A-GAP for supporting different types of accuracy objectives. The results also show that we can effectively control the trade-off between accuracy and protocol overhead, and that the overhead can be reduced significantly by allowing small errors.

  • 33.
    Gonzalez Prieto, Alberto
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Kersch, P.
    Szabo, R.
    Nunzi, G.
    Brunner, M.
    Schuetz, S.
    Distributed management in Ambient Networks. 2007. In: 2007 PROCEEDINGS OF THE 16TH IST MOBILE AND WIRELESS COMMUNICATIONS, NEW YORK: IEEE, 2007, pp. 1091-1095. Conference paper (Refereed)
    Abstract [en]

    Traditional centralized management approaches are not suitable for Ambient Networks (ANs), since centralized management systems neither scale well nor adapt fast enough to changing topologies and network compositions. To meet the requirements for AN management systems, we propose the use of distributed approaches. Specifically, we demonstrate the validity of these approaches through three instantiations: (i) a solution for real-time AN monitoring, (ii) a solution for load balancing in wireless networks and (iii) a solution for resource discovery in ANs.

  • 34.
    Baliosian, Javier
    et al.
    Ericsson Ireland Research Centre, Athlone, Ireland.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Distributed Auto-configuration of Neighboring Cell Graphs in Radio Access Networks. 2010. In: IEEE Transactions on Network and Service Management, ISSN 1932-4537, E-ISSN 1932-4537, Vol. 7, no. 3, pp. 145-157. Journal article (Refereed)
    Abstract [en]

    In order to execute a handover process in a GSM or UMTS Radio Access Network, each cell has a list of neighbors to which such handovers may be made. Today, these lists are statically configured during network planning, which does not allow for dynamic adaptation of the network to changes and unexpected events such as a cell failure. This paper advocates an autonomic, decentralized approach to dynamically configure neighboring cell lists. The main contribution of this work is a novel protocol, called DOC, which detects and continuously tracks the coverage overlaps among cells. The protocol executes on a spanning tree where the nodes are radio base stations and the links represent communication channels. Over this tree, nodes periodically exchange information about terminals that are in their respective coverage area. Bloom filters are used for efficient representations of terminal sets and efficient set operations. The protocol aggregates Bloom filters to reduce the communication overhead and also for routing messages along the tree. Using simulation, we study the system in steady state, when a base station is added or a base station fails, and also during the initialization phase where the system self-configures.

  • 35. Jennings, Brendan
    et al.
    Stadler, Rolf
    Resource Management in Clouds: Survey and Research Challenges. 2015. In: Journal of Network and Systems Management, ISSN 1064-7570, E-ISSN 1573-7705, Vol. 23, no. 3, pp. 567-619. Journal article (Refereed)
    Abstract [en]

    Resource management in a cloud environment is a hard problem, due to: the scale of modern data centers; the heterogeneity of resource types and their interdependencies; the variability and unpredictability of the load; as well as the range of objectives of the different actors in a cloud ecosystem. Consequently, both academia and industry began significant research efforts in this area. In this paper, we survey the recent literature, covering 250+ publications, and highlighting key results. We outline a conceptual framework for cloud resource management and use it to structure the state-of-the-art review. Based on our analysis, we identify five challenges for future investigation. These relate to: providing predictable performance for cloud-hosted applications; achieving global manageability for cloud systems; engineering scalable resource management systems; understanding economic behavior and cloud pricing; and developing solutions for the mobile cloud paradigm.

  • 36.
    Johansson, Björn
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Reglerteknik.
    Adam, Constantin
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Johansson, Mikael
    KTH, Skolan för elektro- och systemteknik (EES), Reglerteknik.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Distributed resource allocation strategies for achieving quality of service in server clusters. 2006. In: PROCEEDINGS OF THE 45TH IEEE CONFERENCE ON DECISION AND CONTROL, 2006, pp. 1990-1995. Conference paper (Refereed)
    Abstract [en]

    We investigate the resource allocation problem for large-scale server clusters with quality-of-service objectives, where key functions are decentralized. Specifically, the optimal service selection is posed as a discrete utility maximization problem that reflects management objectives and resource constraints. We develop an efficient centralized algorithm that solves this problem, and we propose three suboptimal schemes that operate with local information. The performance of the suboptimal schemes is evaluated in simulations, both under idealized conditions and in a full-scale system simulator.

  • 37.
    Jurca, Dan
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Computing Histograms of Local Variables for Real-Time Monitoring using Aggregation Trees. 2009. In: 2009 IFIP/IEEE INTERNATIONAL SYMPOSIUM ON INTEGRATED NETWORK MANAGEMENT (IM 2009) VOLS 1 AND 2, NEW YORK: IEEE, 2009, pp. 367-374. Conference paper (Refereed)
    Abstract [en]

    In this paper we present a protocol for the continuous monitoring of a local network state variable. Our aim is to provide a management station with the value distribution of the local variables across the network, by means of partial histogram aggregation, with minimum protocol overhead. Our protocol is decentralized and asynchronous to achieve robustness and scalability, and it executes on an overlay interconnecting management processes in network devices. On this overlay, the protocol maintains a spanning tree and updates the histogram of the network state variables through incremental aggregation. The protocol makes it possible to control the trade-off between protocol overhead and a global accuracy objective. This functionality is implemented by a dynamic configuration of local error filters that control whether an update is sent towards the management station or not. We evaluate our protocol by means of simulations. Our results demonstrate the controllability of our method in a wide selection of scenarios, and the scalability of our protocol for large-scale networks.

  • 38. Krishnamurthy, Supriya
    et al.
    Ardelius, John
    Aurell, Erik
    KTH, Skolan för datavetenskap och kommunikation (CSC), Beräkningsbiologi, CB. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Dam, Mads
    KTH, Skolan för datavetenskap och kommunikation (CSC), Teoretisk datalogi, TCS.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Wuhib, Fetahi Zebenigus
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Brief Announcement: The Accuracy of Tree-based Counting in Dynamic Networks. 2010. In: PODC 2010: PROCEEDINGS OF THE 2010 ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, NEW YORK: ASSOC COMPUTING MACHINERY, 2010, pp. 291-292. Conference paper (Refereed)
    Abstract [en]

    We study a simple Bellman-Ford-like protocol which performs network size estimation over a tree-shaped overlay. A continuous time Markov model is constructed which allows key protocol characteristics to be estimated under churn, including the expected number of nodes at a given (perceived) distance to the root and, for each such node, the expected (perceived) size of the subnetwork rooted at that node. We validate the model by simulations, using a range of network sizes, node degrees, and churn-to-protocol rates, with convincing results.

  • 39. Lim, K S
    et al.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Real-time views of network traffic using decentralized management. 2005. In: Integrated Network Management IX: MANAGING NEW NETWORKED WORLDS, NEW YORK: IEEE, 2005, pp. 119-132. Conference paper (Refereed)
    Abstract [en]

    The ability to create views of a network on a fast time scale becomes increasingly important as the complexity and diversity of networks increase. These views, which combine information from many distributed points in the network, can provide an administrator with a better understanding of the interdependencies and interactions between network elements and traffic conditions. Applications that could benefit from being able to compute such "near" real-time views of the network range from performance monitoring to fault management. In this paper, we present the architecture of a distributed management infrastructure that enables such views to be computed. Based on our earlier work on decentralized management, our architecture takes a novel database approach that combines the expressive power of SQL with distributed algorithms. We describe the implementation of the system on a platform of embedded Linux devices attached to a network of routers. We provide specific examples of how the system can be used as a powerful distributed real-time monitoring platform. Finally, we derive a performance model of the system and validate it with a set of experiments.

  • 40. Miyazawa, M.
    et al.
    Hayashi, M.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    VNMF: Distributed fault detection using clustering approach for network function virtualization. 2015. In: Proceedings of the 2015 IFIP/IEEE International Symposium on Integrated Network Management, IM 2015, IEEE conference proceedings, 2015, pp. 640-645. Conference paper (Refereed)
    Abstract [en]

    Network function virtualization introduces additional complexity for network management through the use of virtualization environments. The amount of managed data and the operational complexity increase, which makes service assurance and failure recovery harder to realize. In response to this challenge, the paper proposes a distributed management function, called virtualized network management function (vNMF), to detect failures related to virtualized services. vNMF detects the failures by monitoring physical-layer statistics that are processed with a self-organizing map algorithm. Experimental results show that memory leaks and network congestion failures can be successfully detected and that the accuracy of failure detection can be significantly improved compared to common k-means clustering.

  • 41.
    Moradi, Farnaz
    et al.
    Ericsson Res, Stockholm, Sweden..
    Stadler, Rolf
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Nätverk och systemteknik. Swedish Inst Comp Sci RISE SICS, Stockholm, Sweden..
    Johnsson, Andreas
    Ericsson Res, Stockholm, Sweden..
    Performance Prediction in Dynamic Clouds using Transfer Learning. 2019. In: 2019 IFIP/IEEE Symposium on Integrated Network and Service Management, IM 2019, IEEE, 2019, pp. 242-250, article id 8717847. Conference paper (Refereed)
    Abstract [en]

    Learning a performance model for a cloud service is challenging since its operational environment changes during execution, which requires re-training of the model in order to maintain prediction accuracy. Training a new model from scratch generally involves extensive new measurements and often generates a data-collection overhead that negatively affects the service performance. In this paper, we investigate an approach for re-training neural-network models, which is based on transfer learning. Under this approach, a limited number of neural-network layers are re-trained while others remain unchanged. We study the accuracy of the re-trained model and the efficiency of the method with respect to the number of re-trained layers and the number of new measurements. The evaluation is performed using traces collected from a testbed that runs a Video-on-Demand service and a Key-Value Store under various load conditions. We study model re-training after changes in load pattern, infrastructure configuration, service configuration, and target metric. We find that our method significantly reduces the number of new measurements required to compute a new model after a change. The reduction exceeds an order of magnitude in most cases.

  • 42. Niwa, T.
    et al.
    Miyazawa, M.
    Hayashi, M.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Universal fault detection for NFV using SOM-based clustering. 2015. In: 17th Asia-Pacific Network Operations and Management Symposium: Managing a Very Connected World, IEEE, 2015, pp. 315-320. Conference paper (Refereed)
    Abstract [en]

    Network function virtualization (NFV) introduces additional complexity to network management, since the placement and behavior of virtualized network functions (VNFs) can be independent from the underlying hardware, and virtualization technology increases the number of monitoring points and the amount of statistical data. In our previous work, we proposed a framework for detecting anomalous behavior of VNFs using a SOM-based technique. The solution relies upon manually configuring the SOM clustering parameters and selecting the statistics for each failure type in advance, which results in a high maintenance load. In this paper, we provide a solution that is universal in the sense that a range of different faults can be detected using a single set of local statistics and SOM clustering parameters. Experimental results from a testbed show that faults, including memory leak, packet congestion, and session congestion, can be detected with high accuracy using only four types of performance statistics.

  • 43.
    Palmskog, Karl
    et al.
    KTH, Skolan för datavetenskap och kommunikation (CSC), Teoretisk datalogi, TCS.
    Gonzalez Prieto, Alberto
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Meirosu, Catalin
    Ericsson.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Dam, Mads
    KTH, Skolan för datavetenskap och kommunikation (CSC), Teoretisk datalogi, TCS.
    Scalable Metadata-Directed Search in a Network of Information. 2010. In: Future Network and MobileSummit 2010 Conference Proceedings, International Information Management Corporation Limited, 2010, pp. 5722376-. Conference paper (Refereed)
    Abstract [en]

    The information-centric paradigm has recently been proposed for the design of future networking systems. A key requirement for realising such systems is having mechanisms that provide efficient, scalable and accurate information search. In this paper, we present solutions for both one-time and continuous searches. Our solution for one-time searches is scalable, since its search completion time grows sublinearly with the system size. In addition, the overhead it introduces is evenly distributed. For our solution for continuous searches, we discuss the tradeoff between load (efficiency) and timeliness (accuracy).

  • 44.
    Pasquini, Rafael
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre. SICS, Kista, Sweden.
    Moradi, Farnaz
    Ahmed, Jawwad
    Johnsson, Andreas
    Flinta, Christofer
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Nätverk och systemteknik. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre. SICS, Kista, Sweden.
    Predicting SLA Conformance for Cluster-based Services. 2017. In: 2017 IFIP Networking Conference (IFIP NETWORKING) and Workshops, IEEE, 2017. Conference paper (Refereed)
    Abstract [en]

    The ability to predict conformance or violation for given Service-level Agreements (SLAs) is critical for service assurance. We demonstrate a prototype for real-time conformance prediction based on the concept of the capacity region, which abstracts the underlying ICT infrastructure with respect to the load it can carry for a given SLA. The capacity region is estimated through measurements and statistical learning. We demonstrate prediction for a key-value store (Voldemort) that runs on a server cluster located at KTH.

  • 45.
    Pasquini, Rafael
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre. Fac Comp FACOM UFU, Uberlandia, MG, Brazil; Swedish Inst Comp Sci, Stockholm, Sweden.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Learning End-to-end Application QoS from OpenFlow Switch Statistics. 2017. In: 2017 IEEE CONFERENCE ON NETWORK SOFTWARIZATION (IEEE NETSOFT), IEEE, 2017. Conference paper (Refereed)
    Abstract [en]

    We use statistical learning to estimate end-to-end QoS metrics from device statistics, collected from a server cluster and an OpenFlow network. The results from our testbed, which runs a video-on-demand service and a key-value store, demonstrate that the learned models can estimate QoS metrics like frame rate or response time with errors below 10% for a given client. Interestingly, we find that service-level QoS metrics seem "encoded" in network statistics and it suffices to collect OpenFlow per-port statistics to achieve accurate estimation at small overhead for data collection and model computation.

  • 46.
    Pras, A.
    et al.
    Schönwälder, J.
    Burgess, M.
    Festor, O.
    Martínez Pérez, G.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stiller, B.
    Key research challenges in network management. 2007. In: IEEE Communications Magazine, ISSN 0163-6804, E-ISSN 1558-1896, Vol. 45, no. 10, pp. 104-110. Journal article (Refereed)
    Abstract [en]

    Although network management has always played a key role for industry, it only recently received a similar level of attention from many research communities, accelerated by funding opportunities from new initiatives, including the FP7 Program in Europe and GENI/FIND in the United States. Work is ongoing to assess the state of the art and identify the challenges for future research in the field, and this article contributes to this discussion. It presents major findings from a two-day workshop organized jointly by the IRTF/NMRG and the EMANICS Network of Excellence, at which researchers, operators, vendors, and technology developers discussed the research directions to be pursued over the next five years. The workshop identified several topic areas, including management architectures, distributed real-time monitoring, data analysis and visualization, ontologies, economic aspects of management, uncertainty and probabilistic approaches, as well as understanding the behavior of managed systems.

  • 47.
    Prieto, Alberto Gonzalez
    et al.
    KTH, Tidigare Institutioner, Mikroelektronik och informationsteknik, IMIT.
    Cosenza, R.
    Stadler, Rolf
    KTH, Tidigare Institutioner, Mikroelektronik och informationsteknik, IMIT.
    Policy-based congestion management for an SMS gateway. 2004. In: FIFTH IEEE INTERNATIONAL WORKSHOP ON POLICIES FOR DISTRIBUTED SYSTEMS AND NETWORKS, PROCEEDINGS, LOS ALAMITOS: IEEE COMPUTER SOC, 2004, pp. 215-218. Conference paper (Refereed)
    Abstract [en]

    We present a policy-based approach to managing congestion in Short Message Service (SMS) systems. Congestion situations typically occur on SMS Gateways (SMSGs), which route SMS messages between different networks and domains. In our architecture, an SMS operator can dynamically define the maximum acceptable loss of messages of a non-guaranteed SMS service class, thereby controlling the trade-off between minimal message loss and maximum throughput in an SMS system. We present the functional architecture of a manageable SMSG and discuss the realization of the Policy Decision Point (PDP), which applies the congestion policy on the SMSG. An implementation of our architecture on a commercial SMSG, the EMG, is underway.

  • 48.
    Prieto, Alberto Gonzalez
    et al.
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät.
    Adaptive Real-time Monitoring for Large-scale Networked Systems: Dissertation Session. 2009. In: 2009 IFIP/IEEE INTERNATIONAL SYMPOSIUM ON INTEGRATED NETWORK MANAGEMENT (IM 2009) VOLS 1 AND 2, NEW YORK: IEEE, 2009, pp. 790-795. Conference paper (Refereed)
    Abstract [en]

    The focus of this thesis is continuous real-time monitoring, which is essential for the realization of adaptive management systems in large-scale dynamic environments. Real-time monitoring provides the necessary input to the decision-making process of network management. We have developed, implemented, and evaluated a design for real-time continuous monitoring of global metrics with performance objectives, such as monitoring overhead and estimation accuracy. Global metrics describe the state of the system as a whole, in contrast to local metrics, such as device counters or local protocol states, which capture the state of a local entity. Global metrics are computed from local metrics using aggregation functions, such as SUM, AVERAGE and MAX. A key part in the design is a model for the distributed monitoring process that relates performance metrics to parameters that tune the behavior of a monitoring protocol. The model has been instrumental in designing a monitoring protocol that is controllable and achieves given performance objectives. Our design has proved to be effective in meeting performance objectives, efficient, adaptive to changes in the networking conditions, controllable along different performance dimensions, and scalable. We have implemented a prototype on a testbed of commercial routers, which proves the feasibility of the design, and, more generally, the feasibility of effective and efficient real-time monitoring in large network environments.
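    As an illustration of how a global metric can be computed from local metrics with aggregation functions such as SUM, AVERAGE and MAX, the following is a minimal sketch. It assumes aggregation along a spanning tree of nodes; the tree, the node names, and the local values are hypothetical and are not taken from the thesis.

```python
# Hypothetical spanning tree (node -> children) and local metric values.
# These names and numbers are illustrative assumptions only.
tree = {"root": ["a", "b"], "a": ["c", "d"], "b": [], "c": [], "d": []}
local_metric = {"root": 3.0, "a": 5.0, "b": 1.0, "c": 7.0, "d": 4.0}


def aggregate(node):
    """Return the partial aggregate (sum, count, max) of the subtree at node."""
    total = local_metric[node]
    count = 1
    maximum = local_metric[node]
    for child in tree[node]:
        child_total, child_count, child_max = aggregate(child)
        total += child_total
        count += child_count
        maximum = max(maximum, child_max)
    return total, count, maximum


# The root holds the global metrics once all partial aggregates are combined.
total, count, maximum = aggregate("root")
average = total / count  # AVERAGE is derived from the SUM and COUNT partials
```

    Carrying the count alongside the sum lets AVERAGE be computed from partial aggregates without any node seeing all local values.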

  • 49.
    Raz, Danny
    et al.
    Computer Science Department, Technion – Israel Institute of Technology.
    Stadler, Rolf
    KTH, Skolan för elektro- och systemteknik (EES), Kommunikationsnät. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    Elster, Constantine
    Qualcomm Israel.
    Dam, Mads
    KTH, Skolan för datavetenskap och kommunikation (CSC), Teoretisk datalogi, TCS. KTH, Skolan för elektro- och systemteknik (EES), Centra, ACCESS Linnaeus Centre.
    In-Network Monitoring. 2010. In: Algorithms for Next Generation Networks / [ed] Graham Cormode and Marina Thottan, Springer Publishing Company, 2010. Chapter in book, part of anthology (Refereed)
  • 50.
    Samani, Forough Shahab
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Nätverk och systemteknik.
    Stadler, Rolf
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Nätverk och systemteknik.
    Predicting Distributions of Service Metrics using Neural Networks. 2018. In: 2018 14TH INTERNATIONAL CONFERENCE ON NETWORK AND SERVICE MANAGEMENT (CNSM) / [ed] Salsano, S; Riggio, R; Ahmed, T; Samak, T; DosSantos, CRP, IEEE, 2018, pp. 45-53. Conference paper (Refereed)
    Abstract [en]

    We predict the conditional distributions of service metrics, such as response time or frame rate, from infrastructure measurements in a cloud environment. From such distributions, key statistics of the service metrics, including mean, variance, or percentiles can be computed, which are essential for predicting SLA conformance or enabling service assurance. We model the distributions as Gaussian mixtures, whose parameters we predict using mixture density networks, a class of neural networks. We apply the method to a VoD service and a KV store running on our lab testbed. The results validate the effectiveness of the method when applied to operational data. In the case of predicting the mean of the frame rate or response time, the accuracy matches that of random forest, a baseline model.
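    To make concrete how mean, variance, and percentiles follow from a Gaussian mixture, here is a minimal sketch using only the Python standard library. The mixture parameters (weights, means, standard deviations) are hypothetical values standing in for the output of a mixture density network; the function names and the bisection-based percentile search are illustrative choices, not the paper's implementation.

```python
import math


def mixture_mean(w, mu):
    """Mean of a Gaussian mixture: sum_i w_i * mu_i."""
    return sum(wi * mi for wi, mi in zip(w, mu))


def mixture_variance(w, mu, sigma):
    """Variance: sum_i w_i * (sigma_i^2 + mu_i^2) - mean^2."""
    m = mixture_mean(w, mu)
    second_moment = sum(wi * (si ** 2 + mi ** 2) for wi, mi, si in zip(w, mu, sigma))
    return second_moment - m ** 2


def mixture_cdf(x, w, mu, sigma):
    """CDF of the mixture: weighted sum of the component normal CDFs."""
    return sum(
        wi * 0.5 * (1.0 + math.erf((x - mi) / (si * math.sqrt(2.0))))
        for wi, mi, si in zip(w, mu, sigma)
    )


def mixture_percentile(p, w, mu, sigma, tol=1e-9):
    """Find x with CDF(x) = p by bisection (the mixture CDF is monotone)."""
    lo = min(mu) - 10.0 * max(sigma)
    hi = max(mu) + 10.0 * max(sigma)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mixture_cdf(mid, w, mu, sigma) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical two-component mixture, e.g. for a frame rate in frames/s.
weights = [0.5, 0.5]
means = [20.0, 30.0]
stds = [2.0, 2.0]

mean = mixture_mean(weights, means)
variance = mixture_variance(weights, means, stds)
median = mixture_percentile(0.5, weights, means, stds)
```

    A mixture has no closed-form quantile function, which is why the percentile is found numerically here.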
