Frequent sequence mining on longitudinal data: Segregation of Swedish employees
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
This thesis is based on longitudinal data of the Swedish population provided by Statistics Sweden and is conducted on behalf of the Institute for Analytical Sociology. The focus is on investigating the effectiveness of a frequent sequence mining method called constrained Sequential PAttern Discovery using Equivalence classes (cSPADE). The method is applied to data on segregation within workplaces, specifically the reasons Swedish employees move to more segregated workplaces. The thesis found that no unique pattern of age, gender, education, unemployment, income, workplace size or foreignness index explains why a Swedish employee moves to a more segregated workplace. An evaluation of the algorithm showed that, for this specific data set, either the number of observations needs to be reduced or the algorithm needs to be altered to shorten the processing time.
Place, publisher, year, edition, pages
2015. 44 p.
Longitudinal data, frequent sequence mining, cSPADE, segregation
Longitudinell, sekvensanalys, segregering, cSPADE
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:liu:diva-119395
ISRN: LIU-IDA/STAT-A--15/004--SE
OAI: oai:DiVA.org:liu-119395
DiVA: diva2:822012
Institute for Analytical Sociology
Subject / course
2015-06-03, John von Neumann, Linköping, 09:50 (English)
Wänström, Linda, Senior lecturer
Sysoev, Oleg, Senior lecturer