Classification of Wi-Fi Sensor Data for a Smarter City: Probabilistic Classification using Bayesian Statistics
Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.
2019 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

As cities grow and gain residents, traffic problems such as congestion and increased emissions arise. City planners face the challenge of making commuting as easy as possible for residents while reducing vehicle use as far as possible. Before any improvements or reconstructions can be made, the traffic situation has to be mapped. A probabilistic classification of Wi-Fi sensor data collected in an area in the southern part of Stockholm showed that some streets are more likely to be used by cyclists than by pedestrians, while other streets showed the opposite.

The goal of this thesis was to classify each observation as either a pedestrian or a cyclist. To do so, Bayesian statistics was applied in the form of a probabilistic classification. Results from a cluster analysis performed with the K-means algorithm were used as prior information for the probabilistic classification model. To validate the results of this unsupervised statistical learning problem, several model diagnostic methods were used. The final model meets all the criteria for what is considered a stable model and shows clear signs of convergence.
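The two-step idea described above (cluster first, then classify probabilistically) can be illustrated with a minimal sketch. It is not the thesis's model: the single feature (travel speed in m/s), the Gaussian class densities, the synthetic data, and all function names are assumptions made only for illustration.

```python
import math
import random
from statistics import mean, pstdev


def kmeans_1d(values, iters=50):
    """Two-cluster K-means on scalar values; returns a 0/1 label per value."""
    centers = [min(values), max(values)]  # deterministic start: 0 = slow, 1 = fast
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        new_centers = []
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centers.append(mean(members) if members else centers[k])
        if new_centers == centers:  # assignments can no longer change
            break
        centers = new_centers
    return labels


def params_from_clusters(values, labels):
    """Turn the cluster output into class weights and Gaussian parameters."""
    params = []
    for k in (0, 1):
        members = [v for v, lab in zip(values, labels) if lab == k]
        params.append({
            "weight": len(members) / len(values),  # class probability from cluster size
            "mu": mean(members),
            "sigma": max(pstdev(members), 1e-6),    # avoid a zero spread
        })
    return params


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def posterior_cyclist(speed, params):
    """P(cyclist | speed) by Bayes' rule; class 1 is the faster cluster."""
    joint = [p["weight"] * normal_pdf(speed, p["mu"], p["sigma"]) for p in params]
    total = joint[0] + joint[1]
    # If both densities underflow to zero, fall back to the class weight.
    return joint[1] / total if total > 0 else params[1]["weight"]


# Synthetic speeds in m/s (assumed values, not data from the thesis).
random.seed(0)
speeds = ([random.gauss(1.4, 0.3) for _ in range(200)]
          + [random.gauss(5.0, 1.0) for _ in range(100)])

labels = kmeans_1d(speeds)
params = params_from_clusters(speeds, labels)
print(round(posterior_cyclist(1.2, params), 3), round(posterior_cyclist(6.0, params), 3))
```

In this sketch the cluster sizes supply the class probabilities and the cluster means and spreads supply the class densities, so each observation receives a probability of being a cyclist instead of a hard label. A full Bayesian treatment would also place distributions on these parameters and check convergence with model diagnostics, which this example does not attempt.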

The data was collected using Wi-Fi sensors, which detect a passing device when the device searches the area for a network to connect to. This thesis focuses on data from three months. Using Wi-Fi sensors as a data collection method makes it possible to track a device. However, many manufacturers produce network interface controllers that generate randomized addresses when the device is connecting to a network, which makes it difficult to track the majority of devices. Wi-Fi sensor data can therefore be seen as unsuitable for this type of study, and it is suggested that other data collection methods be used in the future.
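One common heuristic for spotting randomized addresses can be sketched as follows. It assumes the sensors record MAC addresses as colon- or hyphen-separated hex strings, which the abstract does not state, and it relies on the IEEE 802 convention that locally administered addresses (the kind randomization schemes generate) have the second-least-significant bit of the first octet set. The function names and sample addresses are hypothetical.

```python
def is_locally_administered(mac: str) -> bool:
    """True if the locally administered bit of the first octet is set."""
    first_octet = int(mac.replace("-", ":").split(":")[0], 16)
    return bool(first_octet & 0x02)


def randomized_share(macs):
    """Fraction of observed addresses flagged as locally administered."""
    if not macs:
        return 0.0
    return sum(is_locally_administered(m) for m in macs) / len(macs)


# Hypothetical observations, not data from the thesis.
observed = ["da:a1:19:00:00:01", "00:1a:2b:3c:4d:5e", "02-00-00-aa-bb-cc"]
print(randomized_share(observed))
```

The flag is only an indicator: a locally administered address is not necessarily randomized, so the resulting share is an estimate of how many observations cannot be followed reliably between sensors.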

Abstract [sv, English translation]

As cities grow and their populations increase, traffic problems such as congestion and particle emissions arise. Traffic planners face the challenge of how to make commuting easier for residents and how to reduce the number of vehicles in urban areas as far as possible. Before potential improvements and reconstructions can be carried out, the traffic has to be mapped. The results of a probabilistic classification of Wi-Fi sensor data collected in an area in the southern part of Stockholm show that some streets are used more by cyclists than by pedestrians, while other streets show the opposite relationship. The results give an indication of what the proportion between the two groups may look like.

The goal was to classify each observation as either a pedestrian or a cyclist. To do so, Bayesian statistics was applied in the form of a probabilistic classification. The results of a cluster analysis performed with the K-means clustering algorithm were used as prior information for the classification model. To validate the results of this unsupervised statistical learning problem, various model diagnostic methods were used. The chosen model meets all the requirements for what is considered reasonable for a stable model and shows clear signs of convergence.

The data was collected with Wi-Fi sensors that detect passing devices searching for potential networks to connect to. This method has proven not to be optimal, since manufacturers today produce network interface cards that generate a randomized address every time a device tries to connect to a network. The randomized addresses make it difficult to follow the majority of devices between sensors, which makes this type of data unsuitable for this type of study. It is therefore suggested that other data collection methods be used in the future.

Place, publisher, year, edition, pages
2019. p. 41
National Category
Mathematics
Identifiers
URN: urn:nbn:se:umu:diva-159797
OAI: oai:DiVA.org:umu-159797
DiVA, id: diva2:1321385
External cooperation
IBM Svenska AB
Educational program
Master of Science in Engineering and Management
Supervisors
Examiners
Available from: 2019-06-10. Created: 2019-06-07. Last updated: 2019-06-10. Bibliographically approved.

Open Access in DiVA

Master thesis (3892 kB), 44 downloads
File information
File name: FULLTEXT01.pdf
File size: 3892 kB
Checksum (SHA-512): b57d72948f179136d52122e9992d25fbd0449284f6261f9beb688bdb13115161f1371a9e3965d4bdf56a542d45092304a63440d0807a387f26089b7a549dd2d7
Type: fulltext
Mimetype: application/pdf
