  • 1.
    Alemu Argaw, Atelach
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Cöster, Rickard
    SICS.
    Karlgren, Jussi
    SICS.
    Sahlgren, Magnus
    SICS.
    Dictionary-based Amharic-French information retrieval. 2006. In: Accessing multilingual information repositories: 6th workshop of the Cross-Language Evaluation Forum, CLEF 2005, Vienna, Austria, 21-23 September, 2005, revised selected papers / [ed] Carol Peters, Fredric C. Gey, Julio Gonzalo, Henning Müller, Gareth J. F. Jones, Michael Kluck, Bernardo Magnini, Maarten de Rijke, Berlin: Springer Berlin/Heidelberg, 2006, pp. 83-92. Conference paper (Other academic)
    Abstract [en]

    We present four approaches to the Amharic-French bilingual track at CLEF 2005. All experiments use a dictionary-based approach to translate the Amharic queries into French bags of words, but while one approach uses word sense discrimination on the translated side of the queries, the other includes all senses of a translated word in the query for searching. We used two search engines, the SICS experimental engine and Lucene, hence four runs with the two approaches. Non-content-bearing words were removed both before and after the dictionary lookup. TF/IDF values supplemented by a heuristic function were used to remove the stop words from the Amharic queries, and two French stop word lists were used to remove them from the French translations. In our experiments, we found that the SICS search engine performs better than Lucene and that using the word-sense-discriminated keywords produces a slightly better result than the full set of non-discriminated keywords.
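The "all senses" variant described in this abstract can be illustrated with a minimal sketch. It assumes a toy dictionary and stop word list; the source tokens (`src_bank`, `src_of`, `src_unknown`) are hypothetical placeholders rather than real Amharic terms, and the function name is an illustration, not the authors' implementation.

```python
def translate_query(query_terms, dictionary, stopwords):
    """Translate a query into a target-language bag of words.

    Every sense listed in the dictionary is kept (the 'all senses' variant);
    target-language stop words and out-of-dictionary terms are dropped.
    """
    bag = []
    for term in query_terms:
        for translation in dictionary.get(term, []):
            if translation not in stopwords:
                bag.append(translation)
    return bag


# Hypothetical toy data: placeholder source tokens mapped to French words.
dictionary = {
    "src_bank": ["banque", "rive"],  # two senses, both kept
    "src_of": ["de"],                # translates to a stop word
}
french_stopwords = {"de", "la", "le"}

bag = translate_query(["src_bank", "src_of", "src_unknown"], dictionary, french_stopwords)
print(bag)
```

A word-sense-discriminating variant would keep only one of the candidate translations per term instead of all of them.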

  • 2.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap. Programvaruutveckling.
    Alemu Argaw, Atelach
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    An Amharic Stemmer: Reducing Words to their Citation Forms. 2007. In: Computational Approaches to Semitic Languages: Common Issues and Resources, 2007. Conference paper (Other academic)
    Abstract [en]

    Stemming is an important analysis step in a number of areas such as natural language processing (NLP), information retrieval (IR), machine translation (MT) and text classification. In this paper we present the development of a stemmer for Amharic that reduces words to their citation forms. Amharic is a Semitic language with rich and complex morphology. The application of such a stemmer is in dictionary-based cross-language IR, where the translation step requires looking up terms in a machine-readable dictionary (MRD). We apply a rule-based approach supplemented by occurrence statistics of words in an MRD and in a 3.1M-word news corpus. The main purpose of the statistical supplements is to resolve ambiguity between alternative segmentations. The stemmer is evaluated on Amharic text from two domains, news articles and a classic fiction text. It is shown to have an accuracy of 60% for the old-fashioned fiction text and 75% for the news articles.

  • 3.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Alemu Argaw, Atelach
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Gambäck, Björn
    Norwegian University of Science and Technology, Trondheim, Norway; SICS, Swedish Institute of Computer Science AB, Kista, Sweden.
    Eyassu, Samuel
    Addis Ababa University, Addis Ababa, Ethiopia.
    Nigussie, Lemma
    Addis Ababa University, Addis Ababa, Ethiopia.
    Classifying Amharic Webnews. 2009. In: Information retrieval (Boston), ISSN 1386-4564, E-ISSN 1573-7659, Vol. 12, no. 3, pp. 416-435. Journal article (Peer-reviewed)
    Abstract [en]

    We present work aimed at compiling an Amharic corpus from the Web and automatically categorizing the texts. Amharic is the second most spoken Semitic language in the world (after Arabic) and is used for countrywide communication in Ethiopia. It is highly inflectional and quite dialectally diversified. We discuss the issues of compiling and annotating a corpus of Amharic news articles from the Web. This corpus was then used in three sets of text classification experiments. Working with a less-researched language highlights a number of practical issues that might otherwise receive less attention or go unnoticed. The purpose of the experiments has not primarily been to develop a cutting-edge text classification system for Amharic, but rather to put the spotlight on some of these issues. The first two sets of experiments investigated the use of Self-Organizing Maps (SOMs) for document classification. Testing on small datasets, we first looked at classifying unseen data into 10 predefined categories of news items, and then at clustering it around query content, taking 16 queries as class labels. The third set of experiments investigated the effect of operations such as stemming and part-of-speech tagging on text classification performance. We compared three representations while constructing classification models based on bagging of decision trees for the 10 predefined news categories. The best accuracy was achieved using the full text as representation. A representation using only the nouns performed almost equally well, confirming the assumption that most of the information required for distinguishing between various categories is actually contained in the nouns, while stemming did not have much effect on the performance of the classifier.

  • 4.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Karlsson, Isak
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Zhao, Jing
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Mining Candidates for Adverse Drug Interactions in Electronic Patient Records. 2014. In: PETRA '14 Proceedings of the 7th International Conference on Pervasive Technologies Related to Assistive Environments, PETRA’14, New York: ACM Press, 2014. Conference paper (Peer-reviewed)
    Abstract [en]

    Electronic patient records provide a valuable source of information for detecting adverse drug events. In this paper, we explore two different but complementary approaches to extracting useful information from electronic patient records with the goal of identifying candidate drugs, or combinations of drugs, to be further investigated for suspected adverse drug events. We propose a novel filter-and-refine approach that combines sequential pattern mining and disproportionality analysis. The proposed method is expected to identify groups of possibly interacting drugs suspected of causing certain adverse drug events. We perform an empirical investigation of the proposed method using a subset of the Stockholm electronic patient record corpus. The data used in this study consists of all diagnoses and medications for a group of patients diagnosed with at least one heart-related diagnosis during the period 2008-2010. The study shows that the method is indeed able to detect combinations of drugs that occur more frequently for patients with cardiovascular diseases than for patients in a control group, providing opportunities for finding candidate drugs that cause adverse drug effects through interaction.
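As a rough illustration of disproportionality analysis, the sketch below computes the proportional reporting ratio (PRR) from a 2x2 contingency table. The abstract does not say which disproportionality measure the authors used, so the choice of PRR is an assumption, and the counts are made up for the example.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR from a 2x2 table of patient counts.

    a: exposed to the drug combination, event observed
    b: exposed, no event
    c: not exposed, event observed
    d: not exposed, no event
    """
    exposed_rate = a / (a + b)
    unexposed_rate = c / (c + d)
    return exposed_rate / unexposed_rate


# Hypothetical counts, not taken from the paper.
prr = proportional_reporting_ratio(a=20, b=80, c=50, d=850)
print(round(prr, 2))
```

A PRR well above 1 suggests the event is reported disproportionately often among exposed patients, which would mark the drug combination as a candidate for further investigation.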

  • 5.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Persson, Hans
    Identifying Factors for the Effectiveness of Treatment of Heart Failure: A Registry Study. 2016. In: IEEE 29th International Symposium on Computer-Based Medical Systems: CBMS 2016, IEEE Computer Society, 2016. Conference paper (Peer-reviewed)
    Abstract [en]

    An administrative health register containing health care data for over 2 million patients will be used to search for factors that can affect the treatment of heart failure. In the study, we will measure the effects of employed treatment for various groups of heart failure patients, using different measures of effectiveness. Significant deviations in effectiveness of treatments of the various patient groups will be reported and factors that may help explain the effect of treatment will be analyzed. The most important factors that may help explain the observed deviations between the different groups will be identified by generating predictive models, for which variable importance can be calculated. The findings may affect recommended treatments as well as highlight deviations from national guidelines.

  • 6.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Learning from Swedish Healthcare Data. 2016. In: Proceedings of the 9th ACM International Conference on PErvasive Technologies Related to Assistive Environments, Association for Computing Machinery (ACM), 2016, 47. Conference paper (Peer-reviewed)
    Abstract [en]

    We present two ongoing projects aimed at learning from health care records. The first project, DADEL, focuses on high-performance data mining for detecting adverse drug events in healthcare, and uses electronic patient records covering seven years of patient record data from the Stockholm region in Sweden. The second project focuses on heart failure and on understanding the differences in treatment between various groups of patients. It uses a Swedish administrative health register containing health care data for over two million patients.

  • 7. Henelius, Andreas
    Puolamaki, Kai
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    A peek into the black box: exploring classifiers by randomization. 2014. In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 28, no. 5-6, pp. 1503-1529. Journal article (Peer-reviewed)
    Abstract [en]

    Classifiers are often opaque and cannot easily be inspected to gain understanding of which factors are of importance. We propose an efficient iterative algorithm to find the attributes and dependencies used by any classifier when making predictions. The performance and utility of the algorithm are demonstrated on two synthetic and 26 real-world datasets, using 15 commonly used learning algorithms to generate the classifiers. The empirical investigation shows that the novel algorithm is indeed able to find groupings of interacting attributes exploited by the different classifiers. These groupings allow for finding similarities among classifiers for a single dataset as well as for determining the extent to which different classifiers exploit such interactions in general.

  • 8.
    Kareem, Hend
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Detecting Hierarchical Ties Using Link-Analysis Ranking at Different Levels of Time Granularity. 2017. Other (Other academic)
    Abstract [en]

    Social networks contain implicit knowledge that can be used to infer hierarchical relations that are not explicitly present in the available data. Interaction patterns are typically affected by users' social relations. We present an approach to inferring such information that applies a link-analysis ranking algorithm at different levels of time granularity. In addition, a voting scheme is employed for obtaining the hierarchical relations. The approach is evaluated on two datasets: the Enron email data set, where the goal is to infer manager-subordinate relationships, and the Co-author data set, where the goal is to infer PhD advisor-advisee relations. The experimental results indicate that the proposed approach outperforms more traditional approaches to inferring hierarchical relations from social networks.

  • 9. Kareem, Hend
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Detecting Hierarchical Ties Using Link-Analysis Ranking at Different Levels of Time Granularity. 2017. Conference paper (Peer-reviewed)
    Abstract [en]

    Social networks contain implicit knowledge that can be used to infer hierarchical relations that are not explicitly present in the available data. Interaction patterns are typically affected by users' social relations. We present an approach to inferring such information that applies a link-analysis ranking algorithm at different levels of time granularity. In addition, a voting scheme is employed for obtaining the hierarchical relations. The approach is evaluated on two datasets: the Enron email data set, where the goal is to infer manager-subordinate relationships, and the Co-author data set, where the goal is to infer PhD advisor-advisee relations. The experimental results indicate that the proposed approach outperforms more traditional approaches to inferring hierarchical relations from social networks.

  • 10.
    Kareem, Hend
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Detecting Hierarchical Ties Using Link-Analysis Ranking at Different Levels of Time Granularity. 2017. Journal article (Other academic)
    Abstract [en]

    Social networks contain implicit knowledge that can be used to infer hierarchical relations that are not explicitly present in the available data. Interaction patterns are typically affected by users' social relations. We present an approach to inferring such information that applies a link-analysis ranking algorithm at different levels of time granularity. In addition, a voting scheme is employed for obtaining the hierarchical relations. The approach is evaluated on two datasets: the Enron email data set, where the goal is to infer manager-subordinate relationships, and the Co-author data set, where the goal is to infer PhD advisor-advisee relations. The experimental results indicate that the proposed approach outperforms more traditional approaches to inferring hierarchical relations from social networks.

  • 11.
    Karlsson, Isak
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Multi-channel ECG classification using forests of randomized shapelet trees. 2015. In: Proceedings of the 8th ACM International Conference on PErvasive Technologies Related to Assistive Environments, Association for Computing Machinery (ACM), 2015, 43. Conference paper (Peer-reviewed)
    Abstract [en]

    Data series of multiple channels occur at high rates and in massive quantities in several application domains, such as healthcare. In this paper, we study the problem of multi-channel ECG classification. We map this problem to multivariate data series classification and propose five methods for solving it, using a split-and-combine approach. The proposed framework is evaluated using three base-classifiers on real-world data for detecting Myocardial Infarction. Extensive experiments are performed on real ECG data extracted from the Physiobank data repository. Our findings emphasize the importance of selecting an appropriate base-classifier for multivariate data series classification, while demonstrating the superiority of the Random Shapelet Forest (0.825 accuracy) against competitor methods (0.664 accuracy for 1-NN under cDTW).

  • 12.
    Karlsson, Isak
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Zhao, Jing
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Predicting Adverse Drug Events by Analyzing Electronic Patient Records. 2013. In: Artificial Intelligence in Medicine: 14th Conference on Artificial Intelligence in Medicine, AIME 2013. Proceedings / [ed] Niels Peek, Roque Marín Morales, Mor Peleg, Springer Berlin/Heidelberg, 2013, Vol. 7885, pp. 125-129. Conference paper (Peer-reviewed)
    Abstract [en]

    Diagnosis codes for adverse drug events (ADEs) are sometimes missing from electronic patient records (EPRs). This may not only affect patient safety in the worst case, but also the number of reported ADEs, resulting in incorrect risk estimates of prescribed drugs. Large databases of electronic patient records (EPRs) are potentially valuable sources of information to support the identification of ADEs. This study investigates the use of machine learning for predicting one specific ADE based on information extracted from EPRs, including age, gender, diagnoses and drugs. Several predictive models are developed and evaluated using different learning algorithms and feature sets. The highest observed AUC is 0.87, obtained by the random forest algorithm. The resulting model can be used for screening EPRs that are not, but possibly should be, assigned a diagnosis code for the ADE under consideration. Preliminary results from using the model are presented.

  • 13. Lundgren, Erik
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Extracting news text from web pages: an application for the visually impaired. 2015. In: PETRA '15: Proceedings of the 8th ACM International Conference on PErvasive Technologies Related to Assistive Environments, New York: ACM Press, 2015, Art. 68. Conference paper (Peer-reviewed)
  • 14.
    Rahman, Mofizur
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Skeppstedt, Maria
    Proposing distributional semantics as a tool for medical vocabulary expansion. 2015. In: International Workshop on Embeddings and Semantics: SEPLN ‘15 / [ed] Parth Gupta, Rafael E. Banchs, Paolo Rosso, 2015. Conference paper (Peer-reviewed)
    Abstract [en]

    A tool that extends a given vocabulary by automatically extracting new term candidates from a corpus could facilitate vocabulary expansion, as well as ensure that extracted terms correspond to those actually used in a specific text genre. We here propose a user interface for such a tool, and evaluate the feasibility of using Random Indexing for positioning new term candidates in a given taxonomy.

  • 15. Sotomane, Constantino
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Massingue, Venancio
    Factors Affecting the Use of Data Mining in Mozambique. 2013. In: IST-Africa 2013 Conference Proceedings / [ed] Paul Cunningham, Miriam Cunningham, International Information Management Corporation Limited, 2013. Conference paper (Peer-reviewed)
    Abstract [en]

    We present a study aimed at finding important factors that affect the acceptance and use of data mining in Mozambique. Input from potential users has been collected and analysed using a mix of qualitative and quantitative methods. The findings indicate that the level of adoption of data mining in Mozambique is primarily affected by poor quality of data, limited skills and human resources, limited support of stakeholders, organizational issues, limited financial resources and lack of adequate technology. These factors are similar to those identified in other studies.

  • 16.
    Sotomane, Constantino
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap. Ministry Of Science and Technology, Mozambique.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Massingue, Venancio
    Short-term Forecasting of Electricity Consumption in Maputo. 2013. In: International Conference on Advances in ICT for Emerging Regions (ICTer) - 2013: Conference Proceedings, IEEE Computer Society, 2013, pp. 132-136. Conference paper (Peer-reviewed)
    Abstract [en]

    We present a short-term load forecasting model for Maputo. The model is based on the concept of multiple models. A clustering method is combined with expert knowledge to identify sub-models. The resulting model, which is the combination of several sub-models, is evaluated and compared to the model currently used by Electricidade de Moçambique E.P. (EDM). The results show that the developed model achieves better accuracy than the one currently used by EDM. The results obtained by applying the model, when translated into financial figures, demonstrate significant economic advantages. The social and environmental implications of the model are also analysed.

  • 17.
    Sotomane, Constantino
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Gallego-Ayala, Jordi
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Massingue, Venancio
    Extracting Patterns from Socioeconomic Databases to Characterize Small Farmers with High and Low Corn Yields in Mozambique: a Data Mining Approach. 2012. In: Advances in Data Mining: Workshop Proceedings / [ed] Isabelle Bichindaritz, Petra Perner, Georg Ruß, Rainer Schmidt, Ibai Publishing, 2012, pp. 99-108. Conference paper (Other academic)
    Abstract [en]

    Mozambique is mainly a rural country. Agriculture is a pillar of the Mozambique economy and is the main source of income for 80% of the population living in rural areas. One of the major problems in the agricultural sector is low productivity, which for most crops is the lowest in Africa. The main food crop cultivated in Mozambique is maize. This research aims to characterize households with high and low maize yields based on the National Agricultural Survey Data from 2007 and 2008 using a data mining approach. To this end, we used: a) decision trees, b) association rules, and c) classification rules. The results show that households with high maize yields are those with the capacity to generate income through the commercialization of their production and agricultural assets. Households with low maize yields are associated with production loss before harvest which results in food insecurity.

  • 18.
    Zhao, Jing
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Henriksson, Aron
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Detecting Adverse Drug Events with Multiple Representations of Clinical Measurements. 2014. In: 2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM): Proceedings, IEEE Computer Society, 2014, pp. 536-543. Conference paper (Peer-reviewed)
    Abstract [en]

    Adverse drug events (ADEs) are grossly under-reported in electronic health records (EHRs). This could be mitigated by methods that are able to detect ADEs in EHRs, thereby allowing for missing ADE-specific diagnosis codes to be identified and added. A crucial aspect of constructing such systems is to find proper representations of the data in order to allow the predictive modeling to be as accurate as possible. One category of EHR data that can be used as an indicator of ADEs is clinical measurements. However, using clinical measurements as features is not unproblematic, because of the high rate of missing values and because they can be repeated a variable number of times in each patient health record. In this study, five basic representations of clinical measurements are proposed and evaluated to handle these two problems. An empirical investigation using random forest on 27 datasets from a real EHR database with different ADE targets is presented, demonstrating that the predictive performance, in terms of accuracy and area under ROC curve, is higher when representing clinical measurements crudely as whether they were taken or how many times they were taken by a patient. Furthermore, a sixth alternative, combining all five basic representations, significantly outperforms using any of the basic representations except for one. A subsequent analysis of variable importance is also conducted with this fused feature set, showing that when clinical measurements have a high missing rate, the number of times they were taken by one patient is ranked as more informative than looking at their actual values. The observation from random forest is also confirmed empirically using other commonly employed classifiers. This study demonstrates that the way in which clinical measurements from EHRs are represented has a high impact on ADE detection, and that using multiple representations outperforms using a basic representation.
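To make the idea of alternative representations concrete, the sketch below derives three features per measurement type from one patient's record: whether the measurement was taken, how many times, and the mean value. The abstract only names the "taken" and "count" representations explicitly; the mean is an assumed stand-in for a value-based representation, and the measurement names and values are hypothetical.

```python
from statistics import mean


def represent_measurements(records, measurement_names):
    """Build simple features from a list of (name, value) measurements.

    Missing measurements get taken=0, count=0 and mean=None, so the
    'crude' representations are defined even when values are absent.
    """
    features = {}
    for name in measurement_names:
        values = [value for n, value in records if n == name]
        features[f"{name}_taken"] = int(bool(values))
        features[f"{name}_count"] = len(values)
        features[f"{name}_mean"] = mean(values) if values else None
    return features


# Hypothetical patient record; names and values are illustrative only.
patient = [("creatinine", 80.0), ("creatinine", 100.0), ("sodium", 140.0)]
features = represent_measurements(patient, ["creatinine", "sodium", "potassium"])
print(features)
```

Concatenating several such feature groups per measurement corresponds loosely to the fused, multiple-representation feature set mentioned in the abstract.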

  • 19.
    Zhao, Jing
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Henriksson, Aron
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Predictive modeling of structured electronic health records for adverse drug event detection. 2015. In: BMC Medical Informatics and Decision Making, ISSN 1472-6947, E-ISSN 1472-6947, Vol. 15, no. SI. Journal article (Peer-reviewed)
    Abstract [en]

    Background: The digitization of healthcare data, resulting from the increasingly widespread adoption of electronic health records, has greatly facilitated its analysis by computational methods and thereby enabled large-scale secondary use thereof. This can be exploited to support public health activities such as pharmacovigilance, wherein the safety of drugs is monitored to inform regulatory decisions about sustained use. To that end, electronic health records have emerged as a potentially valuable data source, providing access to longitudinal observations of patient treatment and drug use. A nascent line of research concerns predictive modeling of healthcare data for the automatic detection of adverse drug events, which presents its own set of challenges: it is not yet clear how to represent the heterogeneous data types in a manner conducive to learning high-performing machine learning models. Methods: Datasets from an electronic health record database are used for learning predictive models with the purpose of detecting adverse drug events. The use and representation of two data types, as well as their combination, are studied: clinical codes, describing prescribed drugs and assigned diagnoses, and measurements. Feature selection is conducted on the various types of data to reduce dimensionality and sparsity, while allowing for an in-depth feature analysis of the usefulness of each data type and representation. Results: Within each data type, combining multiple representations yields better predictive performance compared to using any single representation. The use of clinical codes for adverse drug event detection significantly outperforms the use of measurements; however, there is no significant difference over datasets between using only clinical codes and their combination with measurements. For certain adverse drug events, the combination does, however, outperform using only clinical codes. Feature selection leads to increased predictive performance for both data types, in isolation and combined. Conclusions: We have demonstrated how machine learning can be applied to electronic health records for the purpose of detecting adverse drug events and proposed solutions to some of the challenges this presents, including how to represent the various data types. Overall, clinical codes are more useful than measurements and, in specific cases, it is beneficial to combine the two.

  • 20.
    Zhao, Jing
    et al.
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Henriksson, Aron
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Kvist, Maria
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap. Karolinska Institute, Sweden.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Handling Temporality of Clinical Events for Drug Safety Surveillance. 2015. In: AMIA Annual Symposium Proceedings, ISSN 1559-4076, Vol. 2015, pp. 1371-1380. Journal article (Peer-reviewed)
    Abstract [en]

    Using longitudinal data in electronic health records (EHRs) for post-marketing adverse drug event (ADE) detection allows for monitoring patients throughout their medical history. Machine learning methods have been shown to be efficient and effective in screening health records and detecting ADEs. How best to exploit historical data, as encoded by clinical events in EHRs, is, however, not very well understood. In this study, three strategies for handling temporality of clinical events are proposed and evaluated using an EHR database from Stockholm, Sweden. The random forest learning algorithm is applied to predict fourteen ADEs using clinical events collected from different lengths of patient history. The results show that, in general, including longer patient history leads to improved predictive performance, and that assigning weights to events according to time distance from the ADE yields the biggest improvement.

  • 21.
    Zhao, Jing
    et al.
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Karlsson, Isak
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Applying Methods for Signal Detection in Spontaneous Reports to Electronic Patient Records. 2013. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Association for Computing Machinery (ACM), 2013. Conference paper (Peer-reviewed)
    Abstract [en]

    Currently, pharmacovigilance relies mainly on disproportionality analysis of spontaneous reports. However, the analysis of spontaneous reports is concerned with several problems, such as reliability, under-reporting and insufficient patient information. Longitudinal healthcare data, such as Electronic Patient Records (EPRs) in which comprehensive information of each patient is covered, is a complementary source of information to detect Adverse Drug Events (ADEs). A wide set of disproportionality methods has been developed for analyzing spontaneous reports to assess the risk of reported events being ADEs. This study aims to investigate the use of such methods for detecting ADEs when analyzing EPRs. The data used in this study was extracted from Stockholm EPR Corpus. Four disproportionality methods (proportional reporting rate, reporting odds ratio, Bayesian confidence propagation neural network, and Gamma-Poisson shrinker) were applied in two different ways to analyze EPRs: creating pseudo spontaneous reports based on all observed drug-event pairs (event-level analysis) or analyzing distinct patients who experienced a drug-event pair (patient-level analysis). The methods were evaluated in a case study on safety surveillance of Celecoxib. The results showed that, among the top 200 signals, more ADEs were detected by the event-level analysis than by the patient-level analysis. Moreover, the event-level analysis also resulted in a higher mean average precision. The main conclusion of this study is that the way in which the disproportionality analysis is applied, the event-level or patient-level analysis, can have a much higher impact on the performance than which disproportionality method is employed.
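
The two simplest measures named in the abstract above can be illustrated with a short sketch. It computes the proportional reporting ratio (the abstract's "proportional reporting rate") and the reporting odds ratio from a 2x2 drug-event table, once over all records (event-level) and once over distinct patients (patient-level). The record layout `(patient, drug, event)`, the function names, the toy data, and the deduplication rule used for the patient-level count are all assumptions made for illustration; the abstract does not specify how the authors built their counts.

```python
def contingency(pairs, drug, event):
    """Build the 2x2 table (a, b, c, d) for one drug-event pair.

    a: target drug with target event     b: target drug, other event
    c: other drug with target event      d: other drug, other event
    """
    a = b = c = d = 0
    for dr, ev in pairs:
        if dr == drug and ev == event:
            a += 1
        elif dr == drug:
            b += 1
        elif ev == event:
            c += 1
        else:
            d += 1
    return a, b, c, d


def prr(a, b, c, d):
    """Proportional reporting ratio: [a / (a + b)] / [c / (c + d)]."""
    return (a / (a + b)) / (c / (c + d))


def ror(a, b, c, d):
    """Reporting odds ratio: (a * d) / (b * c)."""
    return (a * d) / (b * c)


def event_level(records):
    """Every observed (drug, event) record counts (assumed reading)."""
    return [(dr, ev) for _, dr, ev in records]


def patient_level(records):
    """Each patient counts once per distinct (drug, event) pair (assumed reading)."""
    return [(dr, ev) for _, dr, ev in sorted(set(records))]


# Hypothetical toy records: (patient, drug, event)
records = [
    ("p1", "celecoxib", "rash"),
    ("p1", "celecoxib", "rash"),
    ("p2", "celecoxib", "nausea"),
    ("p3", "other", "rash"),
    ("p4", "other", "nausea"),
    ("p4", "other", "nausea"),
    ("p5", "other", "nausea"),
]

for name, view in (("event-level", event_level), ("patient-level", patient_level)):
    table = contingency(view(records), "celecoxib", "rash")
    print(name, table, round(prr(*table), 3), round(ror(*table), 3))
```

The sketch has no guard against empty cells; real data would need a continuity correction or a check for zero denominators before dividing.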

  • 22.
    Zhao, Jing
    et al.
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Papapetrou, Panagiotis
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Asker, Lars
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Boström, Henrik
    Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
    Learning from heterogeneous temporal data from electronic health records. 2017. In: Journal of Biomedical Informatics, ISSN 1532-0464, E-ISSN 1532-0480, Vol. 65, pp. 105-119. Journal article (Peer-reviewed)
    Abstract [en]

    Electronic health records contain large amounts of longitudinal data that are valuable for biomedical informatics research. The application of machine learning is a promising alternative to manual analysis of such data. However, the complex structure of the data, which includes clinical events that are unevenly distributed over time, poses a challenge for standard learning algorithms. Some approaches to modeling temporal data rely on extracting single values from time series; however, this leads to the loss of potentially valuable sequential information. How to better account for the temporality of clinical data, hence, remains an important research question. In this study, novel representations of temporal data in electronic health records are explored. These representations retain the sequential information, and are directly compatible with standard machine learning algorithms. The explored methods are based on symbolic sequence representations of time series data, which are utilized in a number of different ways. An empirical investigation, using 19 datasets comprising clinical measurements observed over time from a real database of electronic health records, shows that using a distance measure to random subsequences leads to substantial improvements in predictive performance compared to using the original sequences or clustering the sequences. Evidence is moreover provided on the quality of the symbolic sequence representation by comparing it to sequences that are generated using domain knowledge by clinical experts. The proposed method creates representations that better account for the temporality of clinical events, which is often key to prediction tasks in the biomedical domain.
