1 - 13 of 13
  • 1. Bendtsen, Marcus
    Linköpings universitet, Institutionen för datavetenskap, Databas och informationsteknik. Linköpings universitet, Tekniska fakulteten.
    Regimes in baseball players' career data. 2017. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 31, no. 6, pp. 1580-1621. Journal article (Peer-reviewed)
    Abstract [en]

    In this paper we investigate how we can use gated Bayesian networks, a type of probabilistic graphical model, to represent regimes in baseball players’ career data. We find that baseball players do indeed go through different regimes throughout their career, where each regime can be associated with a certain level of performance. We show that some of the transitions between regimes happen in conjunction with major events in the players’ career, such as being traded or injured, but that some transitions cannot be explained by such events. The resulting model is a tool for managers and coaches that can be used to identify where transitions have occurred, as well as an online monitoring tool to detect which regime the player currently is in.

  • 2. Corander, Jukka
    et al.
    Ekdahl, Magnus
    Koski, Timo
    KTH, Skolan för teknikvetenskap (SCI), Matematik (Inst.), Matematisk statistik.
    Parallell interacting MCMC for learning of topologies of graphical models. 2008. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 17, no. 3, pp. 431-456. Journal article (Peer-reviewed)
    Abstract [en]

    Automated statistical learning of graphical models from data has attained a considerable degree of interest in the machine learning and related literature. Many authors have discussed and/or demonstrated the need for consistent stochastic search methods that would not be as prone to yield locally optimal model structures as simple greedy methods. However, at the same time most of the stochastic search methods are based on a standard Metropolis-Hastings theory that necessitates the use of relatively simple random proposals and prevents the utilization of intelligent and efficient search operators. Here we derive an algorithm for learning topologies of graphical models from samples of a finite set of discrete variables by utilizing and further enhancing a recently introduced theory for non-reversible parallel interacting Markov chain Monte Carlo-style computation. In particular, we illustrate how the non-reversible approach allows for a novel type of creativity in the design of search operators. Also, the parallel aspect of our method illustrates well the advantages of the adaptive nature of search operators to avoid trapping states in the vicinity of locally optimal network topologies.

  • 3. Corander, Jukka
    et al.
    Department of Mathematics, Åbo Akademi University, Åbo, Finland.
    Ekdahl, Magnus
    Linköpings universitet, Matematiska institutionen, Matematisk statistik. Linköpings universitet, Tekniska högskolan.
    Koski, Timo
    Department of Mathematics, Royal Institute of Technology, Stockholm, Sweden.
    Parallell interacting MCMC for learning of topologies of graphical models. 2008. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 17, no. 3, pp. 431-456. Journal article (Peer-reviewed)
    Abstract [en]

    Automated statistical learning of graphical models from data has attained a considerable degree of interest in the machine learning and related literature. Many authors have discussed and/or demonstrated the need for consistent stochastic search methods that would not be as prone to yield locally optimal model structures as simple greedy methods. However, at the same time most of the stochastic search methods are based on a standard Metropolis–Hastings theory that necessitates the use of relatively simple random proposals and prevents the utilization of intelligent and efficient search operators. Here we derive an algorithm for learning topologies of graphical models from samples of a finite set of discrete variables by utilizing and further enhancing a recently introduced theory for non-reversible parallel interacting Markov chain Monte Carlo-style computation. In particular, we illustrate how the non-reversible approach allows for a novel type of creativity in the design of search operators. Also, the parallel aspect of our method illustrates well the advantages of the adaptive nature of search operators to avoid trapping states in the vicinity of locally optimal network topologies.

  • 4. Henelius, Andreas
    et al.
    Puolamäki, Kai
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    A peek into the black box: exploring classifiers by randomization. 2014. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 28, no. 5-6, pp. 1503-1529. Journal article (Peer-reviewed)
    Abstract [en]

    Classifiers are often opaque and cannot easily be inspected to gain understanding of which factors are of importance. We propose an efficient iterative algorithm to find the attributes and dependencies used by any classifier when making predictions. The performance and utility of the algorithm are demonstrated on two synthetic and 26 real-world datasets, using 15 commonly used learning algorithms to generate the classifiers. The empirical investigation shows that the novel algorithm is indeed able to find groupings of interacting attributes exploited by the different classifiers. These groupings allow for finding similarities among classifiers for a single dataset as well as for determining the extent to which different classifiers exploit such interactions in general.

  • 5. Karlsson, Isak
    et al.
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Generalized random shapelet forests. 2016. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 30, no. 5, pp. 1053-1085. Journal article (Peer-reviewed)
    Abstract [en]

    Shapelets are discriminative subsequences of time series, usually embedded in shapelet-based decision trees. The enumeration of time series shapelets is, however, computationally costly, which in addition to the inherent difficulty of the decision tree learning algorithm to effectively handle high-dimensional data, severely limits the applicability of shapelet-based decision tree learning from large (multivariate) time series databases. This paper introduces a novel tree-based ensemble method for univariate and multivariate time series classification using shapelets, called the generalized random shapelet forest algorithm. The algorithm generates a set of shapelet-based decision trees, where both the choice of instances used for building a tree and the choice of shapelets are randomized. For univariate time series, it is demonstrated through an extensive empirical investigation that the proposed algorithm yields predictive performance comparable to the current state-of-the-art and significantly outperforms several alternative algorithms, while being at least an order of magnitude faster. Similarly for multivariate time series, it is shown that the algorithm is significantly less computationally costly and more accurate than the current state-of-the-art.

  • 6. Kostakis, Orestis
    et al.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Finding the longest common sub-pattern in sequences of temporal intervals. 2015. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 29, no. 5, pp. 1178-1210. Journal article (Peer-reviewed)
    Abstract [en]

    We study the problem of finding the longest common sub-pattern (LCSP) shared by two sequences of temporal intervals. In particular we are interested in finding the LCSP of the corresponding arrangements. Arrangements of temporal intervals are a powerful way to encode multiple concurrent labeled events that have a time duration. Discovering commonalities among such arrangements is useful for a wide range of scientific fields and applications, as can be seen from the number and diversity of the datasets we use in our experiments. In this paper, we define the problem of LCSP and prove that it is NP-complete by demonstrating a connection between graphs and arrangements of temporal intervals. This connection leads to a series of interesting open problems. In addition, we provide an exact algorithm to solve the LCSP problem, and also propose and experiment with three polynomial time and space under-approximation techniques. Finally, we introduce two upper bounds for LCSP and study their suitability for speeding up 1-NN search. Experiments are performed on seven datasets taken from a wide range of real application domains, plus two synthetic datasets. Lastly, we describe several application cases that demonstrate the need and suitability of LCSP.

  • 7. Kostakis, Orestis
    et al.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    On Searching and Indexing Sequences of Temporal Intervals. 2017. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 31, no. 3, pp. 809-850. Journal article (Peer-reviewed)
    Abstract [en]

    In several application domains, including sign language, sensor networks, and medicine, events are not necessarily instantaneous but they may have a time duration. Such events build sequences of temporal intervals, which may convey useful domain knowledge; thus, searching and indexing these sequences is crucial. We formulate the problem of comparing sequences of labeled temporal intervals and present a distance measure that can be computed in polynomial time. We prove that the distance measure is metric and satisfies the triangle inequality. For speeding up search in large databases of sequences of temporal intervals, we propose an approximate indexing method that is based on embeddings. The proposed indexing framework is shown to be contractive and can guarantee no false dismissal. The distance measure is tested and benchmarked through rigorous experimentation on real data taken from several application domains, including: American Sign Language annotated video recordings, robot sensor data, and Hepatitis patient data. In addition, the indexing scheme is tested on a large synthetic dataset. Our experiments show that speedups of over an order of magnitude can be achieved while maintaining high levels of accuracy. As a result of our work, it becomes possible to implement recommender systems, search engines and assistive applications for the fields that employ sequences of temporal intervals.

  • 8. Kotsifakos, Alexios
    et al.
    Stefan, Alexandra
    Athitsos, Vassilis
    Das, Gautam
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    DRESS: dimensionality reduction for efficient sequence search. 2015. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 29, no. 5, pp. 1280-1311. Journal article (Peer-reviewed)
    Abstract [en]

    Similarity search in large sequence databases is a problem ubiquitous in a wide range of application domains, including searching biological sequences. In this paper we focus on protein and DNA data, and we propose a novel approximate method for speeding up range queries under the edit distance. Our method works in a filter-and-refine manner, and its key novelty is a query-sensitive mapping that transforms the original string space to a new string space of reduced dimensionality. Specifically, it first identifies the most frequent codewords in the query, and then uses these codewords to convert both the query and the database to a more compact representation. This is achieved by replacing every occurrence of each codeword with a new letter and by removing the remaining parts of the strings. Using this new representation, our method identifies a set of candidate matches that are likely to satisfy the range query, and finally refines these candidates in the original space. The main advantage of our method, compared to alternative methods for whole sequence matching under the edit distance, is that it does not require any training to create the mapping, and it can handle large query lengths with negligible losses in accuracy. Our experimental evaluation demonstrates that, for higher range values and large query sizes, our method produces significantly lower costs and runtimes compared to two state-of-the-art competitor methods.

  • 9. Lijffijt, Jefrey
    et al.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Puolamäki, Kai
    Size matters: choosing the most informative set of window lengths for mining patterns in event sequences. 2015. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 29, no. 6, pp. 1838-1864. Journal article (Peer-reviewed)
    Abstract [en]

    In order to find patterns in data, it is often necessary to aggregate or summarise data at a higher level of granularity. Selecting the appropriate granularity is a challenging task and often no principled solutions exist. This problem is particularly relevant in analysis of data with sequential structure. We consider this problem for a specific type of data, namely event sequences. We introduce the problem of finding the best set of window lengths for analysis of event sequences for algorithms with real-valued output. We present suitable criteria for choosing one or multiple window lengths and show that these naturally translate into a computational optimisation problem. We show that the problem is NP-hard in general, but that it can be approximated efficiently and even analytically in certain cases. We give examples of tasks that demonstrate the applicability of the problem and present extensive experiments on both synthetic data and real data from several domains. We find that the method works well in practice, and that the optimal sets of window lengths themselves can provide new insight into the data.

  • 10. Lijffijt, Jefrey
    et al.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Puolamäki, Kai
    A statistical significance testing approach to mining the most informative set of patterns. 2014. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 28, no. 1, pp. 238-263. Journal article (Peer-reviewed)
  • 11. Norén, G. Niklas
    et al.
    Stockholms universitet, Naturvetenskapliga fakulteten, Matematiska institutionen.
    Hopstadius, Johan
    Bate, Andrew
    Star, Kristina
    Edwards, I. Ralph
    Temporal pattern discovery in longitudinal electronic patient records. 2010. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 20, no. 3, pp. 361-387. Journal article (Peer-reviewed)
    Abstract [en]

    Large collections of electronic patient records provide a vast but still underutilised source of information on the real world use of medicines. They are maintained primarily for the purpose of patient administration, but contain a broad range of clinical information highly relevant for data analysis. While they are a standard resource for epidemiological confirmatory studies, their use in the context of exploratory data analysis is still limited. In this paper, we present a framework for open-ended pattern discovery in large patient records repositories. At the core is a graphical statistical approach to summarising and visualising the temporal association between the prescription of a drug and the occurrence of a medical event. The graphical overview contrasts the observed and expected number of occurrences of the medical event in different time periods both before and after the prescription of interest. In order to effectively screen for important temporal relationships, we introduce a new measure of temporal association, which contrasts the observed-to-expected ratio in a time period immediately after the prescription to the observed-to-expected ratio in a control period 2 years earlier. An important feature of both the observed-to-expected graph and the measure of temporal association is a statistical shrinkage towards the null hypothesis of no association, which provides protection against highlighting spurious associations. We demonstrate the usefulness of the proposed pattern discovery methodology by a set of examples from a collection of over two million patient records in the United Kingdom. The identified patterns include temporal relationships between drug prescriptions and medical events suggestive of persistent and transient risks of adverse events, possible beneficial effects of drugs, periodic co-occurrence, and systematic tendencies of patients to switch from one medication to another.

  • 12. Pensar, Johan
    et al.
    Nyman, Henrik
    Koski, Timo
    KTH, Skolan för teknikvetenskap (SCI), Matematik (Inst.), Matematisk statistik.
    Corander, Jukka
    Labeled directed acyclic graphs: a generalization of context-specific independence in directed graphical models. 2015. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 29, no. 2, pp. 503-533. Journal article (Peer-reviewed)
    Abstract [en]

    We introduce a novel class of labeled directed acyclic graph (LDAG) models for finite sets of discrete variables. LDAGs generalize earlier proposals for allowing local structures in the conditional probability distribution of a node, such that unrestricted label sets determine which edges can be deleted from the underlying directed acyclic graph (DAG) for a given context. Several properties of these models are derived, including a generalization of the concept of Markov equivalence classes. Efficient Bayesian learning of LDAGs is enabled by introducing an LDAG-based factorization of the Dirichlet prior for the model parameters, such that the marginal likelihood can be calculated analytically. In addition, we develop a novel prior distribution for the model structures that can appropriately penalize a model for its labeling complexity. A non-reversible Markov chain Monte Carlo algorithm combined with a greedy hill climbing approach is used for illustrating the useful properties of LDAG models for both real and synthetic data sets.

  • 13. Rögnvaldsson, Thorsteinn
    et al.
    Högskolan i Halmstad, Akademin för informationsteknologi, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR Centrum för tillämpade intelligenta system (IS-lab).
    Nowaczyk, Sławomir
    Högskolan i Halmstad, Akademin för informationsteknologi, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR Centrum för tillämpade intelligenta system (IS-lab).
    Byttner, Stefan
    Högskolan i Halmstad, Akademin för informationsteknologi, Halmstad Embedded and Intelligent Systems Research (EIS), CAISR Centrum för tillämpade intelligenta system (IS-lab).
    Prytz, Rune
    Volvo Group Trucks Technology, Göteborg, Sweden.
    Svensson, Magnus
    Volvo Group Trucks Technology, Göteborg, Sweden.
    Self-monitoring for maintenance of vehicle fleets. 2017. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X. Journal article (Peer-reviewed)
    Abstract [en]

    An approach for intelligent monitoring of mobile cyberphysical systems is described, based on consensus among distributed self-organised agents. Its usefulness is experimentally demonstrated over a long-time case study in an example domain: a fleet of city buses. The proposed solution combines several techniques, allowing for life-long learning under computational and communication constraints. The presented work is a step towards autonomous knowledge discovery in a domain where data volumes are increasing, the complexity of systems is growing, and dedicating human experts to build fault detection and diagnostic models for all possible faults is not economically viable. The embedded, self-organised agents operate on-board the cyberphysical systems, modelling their states and communicating them wirelessly to a back-office application. Those models are subsequently compared against each other to find systems which deviate from the consensus. In this way the group (e.g. a fleet of vehicles) is used to provide a standard, or to describe normal behaviour, together with its expected variability under particular operating conditions. The intention is to detect faults without the need for human experts to anticipate them beforehand. This can be used to build up a knowledge base that accumulates over the life-time of the systems. The approach is demonstrated using data collected during regular operation of a city bus fleet over the period of almost four years. © 2017 The Author(s)
