Digitala Vetenskapliga Arkivet

Planned maintenance
A system upgrade is planned for 10/12-2024, at 12:00-13:00. During this time DiVA will be unavailable.
Change search
Refine search result
1234567 1 - 50 of 661
CiteExportLink to result list
Permanent link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Rows per page
  • 5
  • 10
  • 20
  • 50
  • 100
  • 250
Sort
  • Standard (Relevance)
  • Author A-Ö
  • Author Ö-A
  • Title A-Ö
  • Title Ö-A
  • Publication type A-Ö
  • Publication type Ö-A
  • Issued (Oldest first)
  • Issued (Newest first)
  • Created (Oldest first)
  • Created (Newest first)
  • Last updated (Oldest first)
  • Last updated (Newest first)
  • Disputation date (earliest first)
  • Disputation date (latest first)
  • Standard (Relevance)
  • Author A-Ö
  • Author Ö-A
  • Title A-Ö
  • Title Ö-A
  • Publication type A-Ö
  • Publication type Ö-A
  • Issued (Oldest first)
  • Issued (Newest first)
  • Created (Oldest first)
  • Created (Newest first)
  • Last updated (Oldest first)
  • Last updated (Newest first)
  • Disputation date (earliest first)
  • Disputation date (latest first)
Select
The maximal number of hits you can export is 250. When you want to export more records please use the Create feeds function.
  • 1.
    Abdella, Juhar Ahmed
    et al.
    UAEU, U Arab Emirates.
    Zaki, N. M.
    UAEU, U Arab Emirates.
    Shuaib, Khaled
    UAEU, U Arab Emirates.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Airline ticket price and demand prediction: A survey2021In: Journal of King Saud University - Computer and Information Sciences, ISSN 1319-1578, Vol. 33, no 4, p. 375-391Article in journal (Refereed)
    Abstract [en]

    Nowadays, airline ticket prices can vary dynamically and significantly for the same flight, even for nearby seats within the same cabin. Customers are seeking to get the lowest price while airlines are trying to keep their overall revenue as high as possible and maximize their profit. Airlines use various kinds of computational techniques to increase their revenue such as demand prediction and price discrimination. From the customer side, two kinds of models are proposed by different researchers to save money for customers: models that predict the optimal time to buy a ticket and models that predict the minimum ticket price. In this paper, we present a review of customer side and airlines side prediction models. Our review analysis shows that models on both sides rely on limited set of features such as historical ticket price data, ticket purchase date and departure date. Features extracted from external factors such as social media data and search engine query are not considered. Therefore, we introduce and discuss the concept of using social media data for ticket/demand prediction. (c) 2019 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

    Download full text (pdf)
    fulltext
  • 2.
    Acsintoae, Andra
    et al.
    Univ Bucharest, Romania.
    Florescu, Andrei
    Univ Bucharest, Romania.
    Georgescu, Mariana-Iuliana
    Univ Bucharest, Romania; MBZ Univ Artificial Intelligence, U Arab Emirates; SecurifAI, Romania.
    Mare, Tudor
    SecurifAI, Romania.
    Sumedrea, Paul
    Univ Bucharest, Romania.
    Ionescu, Radu Tudor
    Univ Bucharest, Romania; SecurifAI, Romania.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. MBZ Univ Artificial Intelligence, U Arab Emirates.
    Shah, Mubarak
    Univ Cent Florida, FL 32816 USA.
    UBnormal: New Benchmark for Supervised Open-Set Video Anomaly Detection2022In: 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), IEEE COMPUTER SOC , 2022, p. 20111-20121Conference paper (Refereed)
    Abstract [en]

    Detecting abnormal events in video is commonly framed as a one-class classification task, where training videos contain only normal events, while test videos encompass both normal and abnormal events. In this scenario, anomaly detection is an open-set problem. However, some studies assimilate anomaly detection to action recognition. This is a closed-set scenario that fails to test the capability of systems at detecting new anomaly types. To this end, we propose UBnormal, a new supervised open-set benchmark composed of multiple virtual scenes for video anomaly detection. Unlike existing data sets, we introduce abnormal events annotated at the pixel level at training time, for the first time enabling the use of fully-supervised learning methods for abnormal event detection. To preserve the typical open-set formulation, we make sure to include dis-joint sets of anomaly types in our training and test collections of videos. To our knowledge, UBnormal is the first video anomaly detection benchmark to allow a fair head-to-head comparison between one-class open-set models and supervised closed-set models, as shown in our experiments. Moreover, we provide empirical evidence showing that UB-normal can enhance the performance of a state-of-the-art anomaly detection framework on two prominent data sets, Avenue and ShanghaiTech. Our benchmark is freely available at https://github.com/lilygeorgescu/UBnormal.

  • 3.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Arsic, Dejan
    Munich University of Technology, Germany.
    Ganchev, Todor
    University of Patras, Greece.
    Linderhed, Anna
    FOI Swedish Defence Research Agency.
    Menezes, Paolo
    University of Coimbra, Portugal.
    Ntalampiras, Stavros
    University of Patras, Greece.
    Olma, Tadeusz
    MARAC S.A., Greece.
    Potamitis, Ilyas
    Technological Educational Institute of Crete, Greece.
    Ros, Julien
    Probayes SAS, France.
    Prometheus: Prediction and interpretation of human behaviour based on probabilistic structures and heterogeneous sensors2008Conference paper (Refereed)
    Abstract [en]

    The on-going EU funded project Prometheus (FP7-214901) aims at establishing a general framework which links fundamental sensing tasks to automated cognition processes enabling interpretation and short-term prediction of individual and collective human behaviours in unrestricted environments as well as complex human interactions. To achieve the aforementioned goals, the Prometheus consortium works on the following core scientific and technological objectives:

    1. sensor modeling and information fusion from multiple, heterogeneous perceptual modalities;

    2. modeling, localization, and tracking of multiple people;

    3. modeling, recognition, and short-term prediction of continuous complex human behavior.

    Download full text (pdf)
    fulltext
  • 4.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Evaluating Template Rescaling in Short-Term Single-Object Tracking2015Conference paper (Refereed)
    Abstract [en]

    In recent years, short-term single-object tracking has emerged has a popular research topic, as it constitutes the core of more general tracking systems. Many such tracking methods are based on matching a part of the image with a template that is learnt online and represented by, for example, a correlation filter or a distribution field. In order for such a tracker to be able to not only find the position, but also the scale, of the tracked object in the next frame, some kind of scale estimation step is needed. This step is sometimes separate from the position estimation step, but is nevertheless jointly evaluated in de facto benchmarks. However, for practical as well as scientific reasons, the scale estimation step should be evaluated separately – for example,theremightincertainsituationsbeothermethodsmore suitable for the task. In this paper, we describe an evaluation method for scale estimation in template-based short-term single-object tracking, and evaluate two state-of-the-art tracking methods where estimation of scale and position are separable.

    Download full text (pdf)
    fulltext
  • 5.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Glana Sensors AB, Sweden.
    Renhorn, Ingmar
    Glana Sensors AB, Sweden.
    Chevalier, Tomas
    Scienvisic AB, Sweden.
    Rydell, Joakim
    FOI, Swedish Defence Research Agency, Sweden.
    Bergström, David
    FOI, Swedish Defence Research Agency, Sweden.
    Three-dimensional hyperspectral imaging technique2017In: ALGORITHMS AND TECHNOLOGIES FOR MULTISPECTRAL, HYPERSPECTRAL, AND ULTRASPECTRAL IMAGERY XXIII / [ed] Miguel Velez-Reyes; David W. Messinger, SPIE - International Society for Optical Engineering, 2017, Vol. 10198, article id 1019805Conference paper (Refereed)
    Abstract [en]

    Hyperspectral remote sensing based on unmanned airborne vehicles is a field increasing in importance. The combined functionality of simultaneous hyperspectral and geometric modeling is less developed. A configuration has been developed that enables the reconstruction of the hyperspectral three-dimensional (3D) environment. The hyperspectral camera is based on a linear variable filter and a high frame rate, high resolution camera enabling point-to-point matching and 3D reconstruction. This allows the information to be combined into a single and complete 3D hyperspectral model. In this paper, we describe the camera and illustrate capabilities and difficulties through real-world experiments.

    Download full text (pdf)
    fulltext
  • 6.
    Ahlberg, Jörgen
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Åstrom, Anders
    Swedish Natl Forens Ctr NFC, Linkoping, Sweden.
    Forchheimer, Robert
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, Faculty of Science & Engineering.
    Simultaneous sensing, readout, and classification on an intensity-ranking image sensor2018In: International journal of circuit theory and applications, ISSN 0098-9886, E-ISSN 1097-007X, Vol. 46, no 9, p. 1606-1619Article in journal (Refereed)
    Abstract [en]

    We combine the near-sensor image processing concept with address-event representation leading to an intensity-ranking image sensor (IRIS) and show the benefits of using this type of sensor for image classification. The functionality of IRIS is to output pixel coordinates (X and Y values) continuously as each pixel has collected a certain number of photons. Thus, the pixel outputs will be automatically intensity ranked. By keeping track of the timing of these events, it is possible to record the full dynamic range of the image. However, in many cases, this is not necessary-the intensity ranking in itself gives the needed information for the task at hand. This paper describes techniques for classification and proposes a particular variant (groves) that fits the IRIS architecture well as it can work on the intensity rankings only. Simulation results using the CIFAR-10 dataset compare the results of the proposed method with the more conventional ferns technique. It is concluded that the simultaneous sensing and classification obtainable with the IRIS sensor yields both fast (shorter than full exposure time) and processing-efficient classification.

    Download full text (pdf)
    fulltext
  • 7.
    Ahlman, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Improved Temporal Resolution Using Parallel Imaging in Radial-Cartesian 3D functional MRI2011Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    MRI (Magnetic Resonance Imaging) is a medical imaging method that uses magnetic fields in order to retrieve images of the human body. This thesis revolves around a novel acquisition method of 3D fMRI (functional Magnetic Resonance Imaging) called PRESTO-CAN that uses a radial pattern in order to sample the (kx,kz)-plane of k-space (the frequency domain), and a Cartesian sample pattern in the ky-direction. The radial sample pattern allows for a denser sampling of the central parts of k-space, which contain the most basic frequency information about the structure of the recorded object. This allows for higher temporal resolution to be achieved compared with other sampling methods since a fewer amount of total samples are needed in order to retrieve enough information about how the object has changed over time. Since fMRI is mainly used for monitoring blood flow in the brain, increased temporal resolution means that we can be able to track fast changes in brain activity more efficiently.The temporal resolution can be further improved by reducing the time needed for scanning, which in turn can be achieved by applying parallel imaging. One such parallel imaging method is SENSE (SENSitivity Encoding). The scan time is reduced by decreasing the sampling density, which causes aliasing in the recorded images. The aliasing is removed by the SENSE method by utilizing the extra information provided by the fact that multiple receiver coils with differing sensitivities are used during the acquisition. By measuring the sensitivities of the respective receiver coils and solving an equation system with the aliased images, it is possible to calculate how they would have looked like without aliasing.In this master thesis, SENSE has been successfully implemented in PRESTO-CAN. By using normalized convolution in order to refine the sensitivity maps of the receiver coils, images with satisfying quality was able to be reconstructed when reducing the k-space sample rate by a factor of 2, and images of relatively good quality also when the sample rate was reduced by a factor of 4. In this way, this thesis has been able to contribute to the improvement of the temporal resolution of the PRESTO-CAN method.

    Download full text (pdf)
    Gustav_Ahlman_Examensarbete_SENSE
  • 8.
    Ahlman, Gustav
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Magnusson, Maria
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Dahlqvist Leinhard, Olof
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Medical and Health Sciences, Radiation Physics. Linköping University, Faculty of Health Sciences.
    Lundberg, Peter
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Medical and Health Sciences, Radiation Physics. Linköping University, Department of Medical and Health Sciences, Radiology. Linköping University, Faculty of Health Sciences. Östergötlands Läns Landsting, Center for Surgery, Orthopaedics and Cancer Treatment, Department of Radiation Physics. Östergötlands Läns Landsting, Center for Diagnostics, Department of Radiology in Linköping.
    Increased temporal resolution in radial-Cartesian sampling of k-space by implementation of parallel imaging2011Conference paper (Refereed)
  • 9.
    Ahlqvist, Axel
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Examining Difficulties in Weed Detection2022Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Automatic detection of weeds could be used for more efficient weed control in agriculture. In this master thesis, weed detectors have been trained and examined on data collected by RISE to investigate whether an accurate weed detector could be trained on the collected data. When only using annotations of the weed class Creeping thistle for training and evaluation, a detector achieved a mAP of 0.33. When using four classes of weed, a detector was trained with a mAP of 0.07. The performance was worse than in a previous study also dealing with weed detection. Hypotheses for why the performance was lacking were examined. Experiments indicated that the problem could not fully be explained by the model being underfitted, nor by the object’s backgrounds being too similar to the foreground, nor by the quality of the annotations being too low. The performance was better when training the model with as much data as possible than when only selected segments of the data were used.

    Download full text (pdf)
    fulltext
  • 10.
    Ajisafe, Daniel
    et al.
    Univ British Columbia, Canada.
    Tang, James
    Univ British Columbia, Canada.
    Su, Shih-Yang
    Univ British Columbia, Canada.
    Wandt, Bastian
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Univ British Columbia, Canada.
    Rhodin, Helge
    Univ British Columbia, Canada.
    Mirror-Aware Neural Humans2024In: 2024 INTERNATIONAL CONFERENCE IN 3D VISION, 3DV 2024, IEEE COMPUTER SOC , 2024, p. 1092-1102Conference paper (Refereed)
    Abstract [en]

    Human motion capture either requires multi-camera systems or is unreliable when using single-view input due to depth ambiguities. Meanwhile, mirrors are readily available in urban environments and form an affordable alternative by recording two views with only a single camera. However, the mirror setting poses the additional challenge of handling occlusions of real and mirror image. Going beyond existing mirror approaches for 3D human pose estimation, we utilize mirrors for learning a complete body model, including shape and dense appearance. Our main contributions are extending articulated neural radiance fields to include a notion of a mirror, making it sample-efficient over potential occlusion regions. Together, our contributions realize a consumer-level 3D motion capture system that starts from off-the-shelf 2D poses by automatically calibrating the camera, estimating mirror orientation, and subsequently lifting 2D keypoint detections to 3D skeleton pose that is used to condition the mirror-aware NeRF. We empirically demonstrate the benefit of learning a body model and accounting for occlusion in challenging mirror scenes. The project is available at: https:// danielajisafe. github.io/mirror- aware- neural- humans/.

  • 11.
    Al Khatib, Salwa
    et al.
    Mohamed Bin Zayed Univ Artificial Intelligence MB, U Arab Emirates.
    Boudjoghra, Mohamed El Amine
    Mohamed Bin Zayed Univ Artificial Intelligence MB, U Arab Emirates.
    Lahoud, Jean
    Mohamed Bin Zayed Univ Artificial Intelligence MB, U Arab Emirates.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Mohamed Bin Zayed Univ Artificial Intelligence MB, U Arab Emirates.
    3D Instance Segmentation via Enhanced Spatial and Semantic Supervision2023In: 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, IEEE COMPUTER SOC , 2023, p. 541-550Conference paper (Refereed)
    Abstract [en]

    3D instance segmentation has recently garnered increased attention. Typical deep learning methods adopt point grouping schemes followed by hand-designed geometric clustering. Inspired by the success of transformers for various 3D tasks, newer hybrid approaches have utilized transformer decoders coupled with convolutional backbones that operate on voxelized scenes. However, due to the nature of sparse feature backbones, the extracted features provided to the transformer decoder are lacking in spatial understanding. Thus, such approaches often predict spatially separate objects as single instances. To this end, we introduce a novel approach for 3D point clouds instance segmentation that addresses the challenge of generating distinct instance masks for objects that share similar appearances but are spatially separated. Our method leverages spatial and semantic supervision with query refinement to improve the performance of hybrid 3D instance segmentation models. Specifically, we provide the transformer block with spatial features to facilitate differentiation between similar object queries and incorporate semantic supervision to enhance prediction accuracy based on object class. Our proposed approach outperforms existing methods on the validation sets of ScanNet V2 and ScanNet200 datasets, establishing a new state-of-the-art for this task.

  • 12.
    Alghallabi, Wafa
    et al.
    Mohamed bin Zayed Univ AI, U Arab Emirates.
    Dudhane, Akshay
    Mohamed bin Zayed Univ AI, U Arab Emirates.
    Zamir, Waqas
    Incept Inst AI, U Arab Emirates.
    Khan, Salman
    Mohamed bin Zayed Univ AI, U Arab Emirates; Australian Natl Univ, Australia.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Mohamed bin Zayed Univ AI, U Arab Emirates.
    Accelerated MRI Reconstruction via Dynamic Deformable Alignment Based Transformer2024In: MACHINE LEARNING IN MEDICAL IMAGING, MLMI 2023, PT I, SPRINGER INTERNATIONAL PUBLISHING AG , 2024, Vol. 14348, p. 104-114Conference paper (Refereed)
    Abstract [en]

    Magnetic resonance imaging (MRI) is a slow diagnostic technique due to its time-consuming acquisition speed. To address this, parallel imaging and compressed sensing methods were developed. Parallel imaging acquires multiple anatomy views simultaneously, while compressed sensing acquires fewer samples than traditional methods. However, reconstructing images from undersampled multi-coil data remains challenging. Existing methods concatenate input slices and adjacent slices along the channel dimension to gather more information for MRI reconstruction. Implicit feature alignment within adjacent slices is crucial for optimal reconstruction performance. Hence, we propose MFormer: an accelerated MRI reconstruction transformer with cascading MFormer blocks containing multi-scale Dynamic Deformable Swin Transformer (DST) modules. Unlike other methods, our DST modules implicitly align adjacent slice features using dynamic deformable convolution and extract local non-local features before merging information. We adapt input variations by aggregating deformable convolution kernel weights and biases through a dynamic weight predictor. Extensive experiments on Stanford2D, Stanford3D, and large-scale FastMRI datasets show the merits of our contributions, achieving state-of-the-art MRI reconstruction performance. Our code and models are available at https://github.com/wafaAlghallabi/MFomer.

  • 13.
    Almin, Fredrik
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Detection of Non-Ferrous Materials with Computer Vision2020Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In one of the facilities at the Stena Recycling plant in Halmstad, Sweden, about 300 tonnes of metallic waste is processed each day with the aim of sorting out all non-ferrous material. At the end of this process, non-ferrous materials are

    manually sorted out from the ferrous materials. This thesis investigates a computer vision based approach to identify and localize the non-ferrous materials

    and eventually automate the sorting.Images were captured of ferrous and non-ferrous materials. The images areprocessed and segmented to be used as annotation data for a deep convolutionalneural segmentation network. Network models have been trained on different

    kinds and amounts of data. The resulting models are evaluated and tested in ac-cordance with different evaluation metrics. Methods of creating advanced train-ing data by merging imaging information were tested. Experiments with using

    classifier prediction confidence to identify objects of unknown classes were per-formed.

    This thesis shows that it is possible to discern ferrous from non-ferrous mate-rial with a purely vision based system. The thesis also shows that it is possible to

    automatically create annotated training data. It becomes evident that it is possi-ble to create better training data, tailored for the task at hand, by merging image

    data. A segmentation network trained on more than two classes yields lowerprediction confidence for objects unknown to the classifier.Substituting manual sorting with a purely vision based system seems like aviable approach. Before a substitution is considered, the automatic system needsto be evaluated in comparison to the manual sorting.

    Download full text (pdf)
    Detection of Non-Ferrous Materials with Computer Vision
  • 14.
    Andersson, Elin
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Thermal Impact of a Calibrated Stereo Camera Rig2016Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Measurements performed from stereo reconstruction can be obtained with a high accuracy with correct calibrated cameras. A stereo camera rig mounted in an outdoor environment is exposed to temperature changes, which has an impact of the calibration of the cameras.

    The aim of the master thesis was to investigate the thermal impact of a calibrated stereo camera rig. This was performed by placing a stereo rig in a temperature chamber and collect data of a calibration board at different temperatures. Data was collected with two different cameras and lensesand used for calibration of the stereo camera rig for different scenarios. The obtained parameters were plotted and analyzed.

    The result from the master thesis gives that the thermal variation has an impact of the accuracy of the calibrated stereo camera rig. A calibration obtained in one temperature can not be used for a different temperature without a degradation of the accuracy. The plotted parameters from the calibration had a high noise level due to problems with the calibration methods, and no visible trend from temperature changes could be seen.

    Download full text (pdf)
    fulltext
  • 15.
    Andersson, Maria
    et al.
    FOI Swedish Defence Research Agency.
    Rydell, Joakim
    FOI Swedish Defence Research Agency.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. FOI Swedish Defence Research Agency.
    Estimation of crowd behaviour using sensor networks and sensor fusion2009Conference paper (Refereed)
    Abstract [en]

    Commonly, surveillance operators are today monitoring a large number of CCTV screens, trying to solve the complex cognitive tasks of analyzing crowd behavior and detecting threats and other abnormal behavior. Information overload is a rule rather than an exception. Moreover, CCTV footage lacks important indicators revealing certain threats, and can also in other respects be complemented by data from other sensors. This article presents an approach to automatically interpret sensor data and estimate behaviors of groups of people in order to provide the operator with relevant warnings. We use data from distributed heterogeneous sensors (visual cameras and a thermal infrared camera), and process the sensor data using detection algorithms. The extracted features are fed into a hidden Markov model in order to model normal behavior and detect deviations. We also discuss the use of radars for weapon detection.

  • 16.
    Andersson, Viktor
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Semantic Segmentation: Using Convolutional Neural Networks and Sparse dictionaries2017Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    The two main bottlenecks using deep neural networks are data dependency and training time. This thesis proposes a novel method for weight initialization of the convolutional layers in a convolutional neural network. This thesis introduces the usage of sparse dictionaries. A sparse dictionary optimized on domain specific data can be seen as a set of intelligent feature extracting filters. This thesis investigates the effect of using such filters as kernels in the convolutional layers in the neural network. How do they affect the training time and final performance?

    The dataset used here is the Cityscapes-dataset which is a library of 25000 labeled road scene images.The sparse dictionary was acquired using the K-SVD method. The filters were added to two different networks whose performance was tested individually. One of the architectures is much deeper than the other. The results have been presented for both networks. The results show that filter initialization is an important aspect which should be taken into consideration while training the deep networks for semantic segmentation.

    Download full text (pdf)
    fulltext
  • 17.
    Antonsson, Per
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Johansson, Jesper
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Measuring Respiratory Frequency Using Optronics and Computer Vision2021Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    This thesis investigates the development and use of software to measure respiratory frequency on cows using optronics and computer vision. It examines mainly two different strategies of image and signal processing and their performances for different input qualities. The effect of heat stress on dairy cows and the high transmission risk of pneumonia for calves make the investigation done during this thesis highly relevant since they both have the same symptom; increased respiratory frequency. The data set used in this thesis was of recorded dairy cows in different environments and from varying angles. Recordings, where the authors could determine a true breathing frequency by monitoring body movements, were accepted to the data set and used to test and develop the algorithms. One method developed in this thesis estimated the breathing rate in the frequency domain by Fast Fourier Transform and was named "N-point Fast Fourier Transform." The other method was called "Breathing Movement Zero-Crossing Counting." It estimated a signal in the time domain, whose fundamental frequency was determined by a zero-crossing algorithm as the breathing frequency. The result showed that both the developed algorithm successfully estimated a breathing frequency with a reasonable error margin for most of the data set. The zero-crossing algorithm showed the most consistent result with an error margin lower than 0.92 breaths per minute (BPM) for twelve of thirteen recordings. However, it is limited to recordings where the camera is placed above the cow. The N-point FFT algorithm estimated the breathing frequency with error margins between 0.44 and 5.20 BPM for the same recordings as the zero-crossing algorithm. This method is not limited to a specific camera angle but requires the cow to be relatively stationary to get accurate results. Therefore, it could be evaluated with the remaining three recordings of the data set. The error margins for these recordings were measured between 1.92 and 10.88 BPM. Both methods had execution time acceptable for implementation in real-time. It was, however, too incomplete a data set to determine any performance with recordings from different optronic devices. 

    Download full text (pdf)
    fulltext
  • 18.
    Anwar, Saeed
    et al.
    Australian Natl Univ, Australia; Univ Canberra, Australia.
    Tahir, Muhammad
    Natl Univ Sci & Technol NUST, Pakistan.
    Li, Chongyi
    Nankai Univ, Peoples R China.
    Mian, Ajmal
    Univ Western Australia, Australia.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Mohamed Bin Zayed Univ Artificial Intelligence, U Arab Emirates.
    Muzaffar, Abdul Wahab
    Saudi Elect Univ, Saudi Arabia.
    Image colorization: A survey and dataset2025In: Information Fusion, ISSN 1566-2535, E-ISSN 1872-6305, Vol. 114, article id 102720Article in journal (Refereed)
    Abstract [en]

    Image colorization estimates RGB colors for grayscale images or video frames to improve their aesthetic and perceptual quality. Over the last decade, deep learning techniques for image colorization have significantly progressed, necessitating a systematic survey and benchmarking of these techniques. This article presents a comprehensive survey of recent state-of-the-art deep learning-based image colorization techniques, describing their fundamental block architectures, inputs, optimizers, loss functions, training protocols, training data, etc. It categorizes the existing colorization techniques into seven classes and discusses important factors governing their performance, such as benchmark datasets and evaluation metrics. We highlight the limitations of existing datasets and introduce a new dataset specific to colorization. We perform an extensive experimental evaluation of existing image colorization methods using both existing datasets and our proposed one. Finally, we discuss the limitations of existing methods and recommend possible solutions and future research directions for this rapidly evolving topic of deep image colorization. The dataset and codes for evaluation are publicly available at https://github.com/saeed-anwar/ColorSurvey.

  • 19.
    Anwer, Rao Muhammad
    et al.
    Aalto Univ, Finland.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Laaksonen, Jorma
    Aalto Univ, Finland.
    Two-Stream Part-based Deep Representation for Human Attribute Recognition2018In: 2018 INTERNATIONAL CONFERENCE ON BIOMETRICS (ICB), IEEE , 2018, p. 90-97Conference paper (Refereed)
    Abstract [en]

    Recognizing human attributes in unconstrained environments is a challenging computer vision problem. State-of-the-art approaches to human attribute recognition are based on convolutional neural networks (CNNs). The de facto practice when training these CNNs on a large labeled image dataset is to take RGB pixel values of an image as input to the network. In this work, we propose a two-stream part-based deep representation for human attribute classification. Besides the standard RGB stream, we train a deep network by using mapped coded images with explicit texture information, that complements the standard RGB deep model. To integrate human body parts knowledge, we employ the deformable part-based models together with our two-stream deep model. Experiments are performed on the challenging Human Attributes (HAT-27) Dataset consisting of 27 different human attributes. Our results clearly show that (a) the two-stream deep network provides consistent gain in performance over the standard RGB model and (b) that the attribute classification results are further improved with our two-stream part-based deep representations, leading to state-of-the-art results.

  • 20.
    Anwer, Rao Muhammad
    et al.
    Aalto Univ, Finland; Incept Inst Artificial Intelligence, U Arab Emirates.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Incept Inst Artificial Intelligence, U Arab Emirates.
    Laaksonen, Jorma
    Aalto Univ, Finland.
    Zaki, Nazar
    United Arab Emirates Univ, U Arab Emirates.
    Multi-stream Convolutional Networks for Indoor Scene Recognition2019In: COMPUTER ANALYSIS OF IMAGES AND PATTERNS, CAIP 2019, PT I, SPRINGER INTERNATIONAL PUBLISHING AG , 2019, Vol. 11678, p. 196-208Conference paper (Refereed)
    Abstract [en]

    Convolutional neural networks (CNNs) have recently achieved outstanding results for various vision tasks, including indoor scene understanding. The de facto practice employed by state-of-the-art indoor scene recognition approaches is to use RGB pixel values as input to CNN models that are trained on large amounts of labeled data (Image-Net or Places). Here, we investigate CNN architectures by augmenting RGB images with estimated depth and texture information, as multiple streams, for monocular indoor scene recognition. First, we exploit the recent advancements in the field of depth estimation from monocular images and use the estimated depth information to train a CNN model for learning deep depth features. Second, we train a CNN model to exploit the successful Local Binary Patterns (LBP) by using mapped coded images with explicit LBP encoding to capture texture information available in indoor scenes. We further investigate different fusion strategies to combine the learned deep depth and texture streams with the traditional RGB stream. Comprehensive experiments are performed on three indoor scene classification benchmarks: MIT-67, OCIS and SUN-397. The proposed multi-stream network significantly outperforms the standard RGB network by achieving an absolute gain of 9.3%, 4.7%, 7.3% on the MIT-67, OCIS and SUN-397 datasets respectively.

  • 21.
    Anwer, Rao Muhammad
    et al.
    Aalto Univ, Finland.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    van de Weijer, Joost
    Univ Autonoma Barcelona, Spain.
    Molinier, Matthieu
    VTT Tech Res Ctr Finland Ltd, Finland.
    Laaksonen, Jorma
    Aalto Univ, Finland.
    Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification2018In: ISPRS journal of photogrammetry and remote sensing (Print), ISSN 0924-2716, E-ISSN 1872-8235, Vol. 138, p. 74-85Article in journal (Refereed)
    Abstract [en]

    Designing discriminative powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distribution of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP based texture information provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state-of-the-art for remote sensing scene classification. (C) 2018 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.

  • 22.
    Ardeshiri, Tohid
    et al.
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Larsson, Fredrik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Gustafsson, Fredrik
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Schön, Thomas B.
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, The Institute of Technology.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Bicycle Tracking Using Ellipse Extraction2011In: Proceedings of the 14thInternational Conference on Information Fusion, 2011, IEEE , 2011, p. 1-8Conference paper (Refereed)
    Abstract [en]

    A new approach to track bicycles from imagery sensor data is proposed. It is based on detecting ellipsoids in the images, and treat these pair-wise using a dynamic bicycle model. One important application area is in automotive collision avoidance systems, where no dedicated systems for bicyclists yet exist and where very few theoretical studies have been published.

    Possible conflicts can be predicted from the position and velocity state in the model, but also from the steering wheel articulation and roll angle that indicate yaw changes before the velocity vector changes. An algorithm is proposed which consists of an ellipsoid detection and estimation algorithm and a particle filter.

    A simulation study of three critical single target scenarios is presented, and the algorithm is shown to produce excellent state estimates. An experiment using a stationary camera and the particle filter for state estimation is performed and has shown encouraging results.

  • 23.
    Asplund, Hugo
    Linköping University, Department of Electrical Engineering, Computer Vision.
    2.5D Feature Based Correspondence Matching for Part Localization2024Independent thesis Advanced level (degree of Master (Two Years)), 28 HE creditsStudent thesis
    Abstract [en]

    In the area of automation, object localization stands as a fundamental functionalitywith widespread applicability. This master’s thesis focuses on a specificapplication involving robot object picking. Given recent advancements in depthcamera technology, there is a high interest in exploring the synergistic integrationof both 2D and 3D data to address challenges such as missing data, occlusion,varying viewing angles, and diverse lighting conditions.

    This master’s thesis presents the development of two distinct algorithms for arbitraryshaped template matching using 2D image features. Both algorithms leveragefeatures detected by the GoodFeaturesToTrack algorithm and described withScale-invariant feature transform (SIFT) descriptors. While an initial sliding windowmatcher was developed, it was ultimately discarded due to extensive timerequirements. Instead, a correspondence matcher was created, offering two variations:one exclusively employing 2D image data for matching and another utilizing3D coordinates to enhance matching accuracy. The correspondence matchingalgorithms showed similar strengths and weaknesses. They demonstrated proficiencyin handling scenarios characterized by occlusion, minor tilt, and varyingscaling. Both variations struggled with objects 90-degrees rotated and could inmany cases not find them.

    The findings suggest that the developed feature-based correspondence matchingalgorithm holds promise for object localization in industrial picking applications,although with limitations concerning objects with substantial rotationdifferences. Addressing the challenge of large rotations is recommended for enhancingthe algorithm’s robustness, followed by comprehensive testing to ascertainits efficacy in diverse scenarios.iii

    Download full text (pdf)
    2.5D_Feature_Based_Correspondence_Matching_for_Part_Localization
  • 24.
    Bai, Tao
    et al.
    Nanyang Technol Univ, Singapore.
    Cao, Yiming
    Singapore Management Univ, Singapore.
    Xu, Yonghao
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Wen, Bihan
    Nanyang Technol Univ, Singapore.
    Stealthy Adversarial Examples for Semantic Segmentation in Remote Sensing2024In: IEEE Transactions on Geoscience and Remote Sensing, ISSN 0196-2892, E-ISSN 1558-0644, Vol. 62, article id 5614817Article in journal (Refereed)
    Abstract [en]

    Deep learning methods have been proven effective in remote sensing image analysis and interpretation, where semantic segmentation plays a vital role. These deep segmentation methods are susceptible to adversarial attacks, while most of the existing attack methods tend to manipulate the image globally, leading to noticeable perturbations and chaotic segmentation. In this work, we propose a novel stealthy attack for semantic segmentation (SASS), which can largely increase the effectiveness and stealthiness from the existing attack methods on remote sensing images. SASS manipulates specific victim classes or objects of interest while preserving the original segmentation results for other classes or objects. In practice, as different inference mechanisms, overlapped inference, can be applied in segmentation, the efficacy of SASS may be degraded. To this end, we further introduce the masked SASS (MSASS), which generates augmented adversarial perturbations that only affect victim areas. We evaluate the effectiveness of SASS and MSASS using four state-of-the-art semantic segmentation models on the Vaihingen and Zurich Summer datasets. Extensive experiments demonstrate that our SASS and MSASS methods achieve superior attack performances on victim areas while maintaining high accuracies of other areas (drop less than 2%). The detection success rates of adversarial examples for segmentation, as characterized by Xiao et al., significantly drop from 97.78% for the untargeted projected gradient descent (PGD) attack to 28.71% for our MSASS method on the Zurich Summer dataset. Our work contributes to the field of adversarial attacks in semantic segmentation for remote sensing images by improving stealthiness, flexibility, and robustness. We anticipate that our findings will inspire the development of defense methods to enhance the security and reliability of semantic segmentation models against our stealthy attack.

  • 25.
    Baravdish, George
    et al.
    Linköping University, Department of Science and Technology, Communications and Transport Systems. Linköping University, The Institute of Technology.
    Svensson, Olof
    Linköping University, Department of Science and Technology, Communications and Transport Systems. Linköping University, The Institute of Technology.
    Åström, Freddie
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    On Backward p(x)-Parabolic Equations for Image Enhancement2015In: Numerical Functional Analysis and Optimization, ISSN 0163-0563, E-ISSN 1532-2467, Vol. 36, no 2, p. 147-168Article in journal (Refereed)
    Abstract [en]

    In this study, we investigate the backward p(x)-parabolic equation as a new methodology to enhance images. We propose a novel iterative regularization procedure for the backward p(x)-parabolic equation based on the nonlinear Landweber method for inverse problems. The proposed scheme can also be extended to the family of iterative regularization methods involving the nonlinear Landweber method. We also investigate the connection between the variable exponent p(x) in the proposed energy functional and the diffusivity function in the corresponding Euler-Lagrange equation. It is well known that the forward problems converges to a constant solution destroying the image. The purpose of the approach of the backward problems is twofold. First, solving the backward problem by a sequence of forward problems we obtain a smooth image which is denoised. Second, by choosing the initial data properly we try to reduce the blurriness of the image. The numerical results for denoising appear to give improvement over standard methods as shown by preliminary results.

  • 26.
    Barbalau, Antonio
    et al.
    Univ Bucharest, Romania.
    Ionescu, Radu Tudor
    Univ Bucharest, Romania; SecurifAI, Romania; MBZ Univ Artificial Intelligence, U Arab Emirates.
    Georgescu, Mariana-Iuliana
    Univ Bucharest, Romania; SecurifAI, Romania.
    Dueholm, Jacob
    Aalborg Univ, Denmark; Milestone Syst, Denmark.
    Ramachandra, Bharathkumar
    Geopipe Inc, NY 10019 USA.
    Nasrollahi, Kamal
    Aalborg Univ, Denmark; Milestone Syst, Denmark.
    Khan, Fahad
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. MBZ Univ Artificial Intelligence, U Arab Emirates.
    Moeslund, Thomas B.
    Aalborg Univ, Denmark.
    Shah, Mubarak
    Univ Cent Florida, FL 32816 USA.
    SSMTL plus plus : Revisiting self-supervised multi-task learning for video anomaly detection2023In: Computer Vision and Image Understanding, ISSN 1077-3142, E-ISSN 1090-235X, Vol. 229, article id 103656Article in journal (Refereed)
    Abstract [en]

    A self-supervised multi-task learning (SSMTL) framework for video anomaly detection was recently introduced in literature. Due to its highly accurate results, the method attracted the attention of many researchers. In this work, we revisit the self-supervised multi-task learning framework, proposing several updates to the original method. First, we study various detection methods, e.g. based on detecting high-motion regions using optical flow or background subtraction, since we believe the currently used pre-trained YOLOv3 is suboptimal, e.g. objects in motion or objects from unknown classes are never detected. Second, we modernize the 3D convolutional backbone by introducing multi-head self-attention modules, inspired by the recent success of vision transformers. As such, we alternatively introduce both 2D and 3D convolutional vision transformer (CvT) blocks. Third, in our attempt to further improve the model, we study additional self-supervised learning tasks, such as predicting segmentation maps through knowledge distillation, solving jigsaw puzzles, estimating body pose through knowledge distillation, predicting masked regions (inpainting), and adversarial learning with pseudo-anomalies. We conduct experiments to assess the performance impact of the introduced changes. Upon finding more promising configurations of the framework, dubbed SSMTL++v1 and SSMTL++v2, we extend our preliminary experiments to more data sets, demonstrating that our performance gains are consistent across all data sets. In most cases, our results on Avenue, ShanghaiTech and UBnormal raise the state-of-the-art performance bar to a new level.

  • 27.
    Barberi, Emmanuele
    et al.
    Università degli Studi di Messina, Italy.
    Cucinotta, Filippo
    Università degli Studi di Messina, Italy.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Raffaele, Marcello
    Università degli Studi di Messina, Italy.
    Salmeri, Fabio
    Università degli Studi di Messina, Italy.
    A differential entropy-based method for reverse engineering quality assessment2023Conference paper (Other academic)
    Abstract [en]

    The present work proposes the use of point clouds differential entropy as a method for reverse engineering quality assessment. This quality assessment can be used to measure the deviation of objects made with additive manufacturing or CNC techniques. The quality of the execution is intended as a measure of the deviation of the geometry of the obtained object compared to the original CAD. This paper proposes the use of the quality index of the CorAl method to assess the quality of an objects compared to its original CAD. This index, based on the differential entropy, takes on a value the closer to 0 the more they obtained object is close to the original geometry. The advantage of this method is to have a global synthetic index. It is however possible to have entropy maps of the individual points to verify which are the areas with the greatest deviation. The method is robust for comparing point clouds at different densities. Objects obtained by additive manufacturing with different print qualities were used. The quality index evaluated for each object, as defined in the CorAl method, turns out to be gradually closer to 0 as the quality of the piece's construction increases.

  • 28.
    Barberi, Emmanuele
    et al.
    Univ Messina, Italy.
    Cucinotta, Filippo
    Univ Messina, Italy.
    Forssén, Per-Erik
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Raffaele, Marcello
    Univ Messina, Italy.
    Salmeri, Fabio
    Univ Messina, Italy.
    A Differential Entropy-Based Method for Reverse Engineering Quality Assessment2024In: DESIGN TOOLS AND METHODS IN INDUSTRIAL ENGINEERING III, VOL 2, ADM 2023, SPRINGER INTERNATIONAL PUBLISHING AG , 2024, p. 451-458Conference paper (Refereed)
    Abstract [en]

    The present work proposes the use of point clouds differential entropy as a new method for reverse engineering quality assessment. This quality assessment can be used to measure the shape deviation of objects made with additive manufacturing or CNC techniques. The quality of the execution is intended as a measure of the deviation of the geometry of the obtained object compared to the original CAD model. The method exploits the quality index of the CorAl method. The advantages of CorAl are several, among them the use of a unique index of comparison, no problem of commutativity of the comparison, noise immunity, low influence of the presence of holes and of the point cloud densities. It is possible to plot quality maps showing the areas with the greatest deviation. In experiments, objects obtained by additive manufacturingwith different print qualities are tested. The robustness of the method is also demonstrated. The quality index evaluated for each object, as defined in the CorAl method, turns out to be gradually closer to 0 as the quality of the piece's construction increases.

  • 29.
    Barnada, Marc
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University of Frankfurt, Germany.
    Conrad, Christian
    Goethe University of Frankfurt, Germany.
    Bradler, Henry
    Goethe University of Frankfurt, Germany.
    Ochs, Matthias
    Goethe University of Frankfurt, Germany.
    Mester, Rudolf
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Goethe University of Frankfurt, Germany.
    Estimation of Automotive Pitch, Yaw, and Roll using Enhanced Phase Correlation on Multiple Far-field Windows2015In: 2015 IEEE Intelligent Vehicles Symposium (IV), IEEE , 2015, p. 481-486Conference paper (Refereed)
    Abstract [en]

    The online-estimation of yaw, pitch, and roll of a moving vehicle is an important ingredient for systems which estimate egomotion, and 3D structure of the environment in a moving vehicle from video information. We present an approach to estimate these angular changes from monocular visual data, based on the fact that the motion of far distant points is not dependent on translation, but only on the current rotation of the camera. The presented approach does not require features (corners, edges,...) to be extracted. It allows to estimate in parallel also the illumination changes from frame to frame, and thus allows to largely stabilize the estimation of image correspondences and motion vectors, which are most often central entities needed for computating scene structure, distances, etc. The method is significantly less complex and much faster than a full egomotion computation from features, such as PTAM [6], but it can be used for providing motion priors and reduce search spaces for more complex methods which perform a complete analysis of egomotion and dynamic 3D structure of the scene in which a vehicle moves.

  • 30.
    Bengtsson, Morgan
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Indoor 3D Mapping using Kinect2014Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In recent years several depth cameras have emerged on the consumer market, creating many interesting possibilities forboth professional and recreational usage. One example of such a camera is the Microsoft Kinect sensor originally usedwith the Microsoft Xbox 360 game console. In this master thesis a system is presented that utilizes this device in order to create an as accurate as possible 3D reconstruction of an indoor environment. The major novelty of the presented system is the data structure based on signed distance fields and voxel octrees used to represent the observed environment.

    Download full text (pdf)
    fulltext
  • 31.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology.
    Classification of leakage detections acquired by airborne thermography of district heating networks2013Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    In Sweden and many other northern countries, it is common for heat to be distributed to homes and industries through district heating networks. Such networks consist of pipes buried underground carrying hot water or steam with temperatures in the range of 90-150 C. Due to bad insulation or cracks, heat or water leakages might appear.

    A system for large-scale monitoring of district heating networks through remote thermography has been developed and is in use at the company Termisk Systemteknik AB. Infrared images are captured from an aircraft and analysed, finding and indicating the areas for which the ground temperature is higher than normal. During the analysis there are, however, many other warm areas than true water or energy leakages that are marked as detections. Objects or phenomena that can cause false alarms are those who, for some reason, are warmer than their surroundings, for example, chimneys, cars and heat leakages from buildings.

    During the last couple of years, the system has been used in a number of cities. Therefore, there exists a fair amount of examples of different types of detections. The purpose of the present master’s thesis is to evaluate the reduction of false alarms of the existing analysis that can be achieved with the use of a learning system, i.e. a system which can learn how to recognize different types of detections. 

    A labelled data set for training and testing was acquired by contact with customers. Furthermore, a number of features describing the intensity difference within the detection, its shape and propagation as well as proximity information were found, implemented and evaluated. Finally, four different classifiers and other methods for classification were evaluated.

    The method that obtained the best results consists of two steps. In the initial step, all detections which lie on top of a building are removed from the data set of labelled detections. The second step consists of classification using a Random forest classifier. Using this two-step method, the number of false alarms is reduced by 43% while the percentage of water and energy detections correctly classified is 99%.

    Download full text (pdf)
    fulltext
  • 32.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Detection and Tracking in Thermal Infrared Imagery2016Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as it is possible to measure a temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy.

    This thesis addresses the problem of detection and tracking in thermal infrared imagery. Visual detection and tracking of objects in video are research areas that have been and currently are subject to extensive research. Indications oftheir popularity are recent benchmarks such as the annual Visual Object Tracking (VOT) challenges, the Object Tracking Benchmarks, the series of workshops on Performance Evaluation of Tracking and Surveillance (PETS), and the workshops on Change Detection. Benchmark results indicate that detection and tracking are still challenging problems.

    A common belief is that detection and tracking in thermal infrared imagery is identical to detection and tracking in grayscale visual imagery. This thesis argues that the preceding allegation is not true. The characteristics of thermal infrared radiation and imagery pose certain challenges to image analysis algorithms. The thesis describes these characteristics and challenges as well as presents evaluation results confirming the hypothesis.

    Detection and tracking are often treated as two separate problems. However, some tracking methods, e.g. template-based tracking methods, base their tracking on repeated specific detections. They learn a model of the object that is adaptively updated. That is, detection and tracking are performed jointly. The thesis includes a template-based tracking method designed specifically for thermal infrared imagery, describes a thermal infrared dataset for evaluation of template-based tracking methods, and provides an overview of the first challenge on short-term,single-object tracking in thermal infrared video. Finally, two applications employing detection and tracking methods are presented.

    Download full text (pdf)
    fulltext
    Download (pdf)
    omslag
    Download (jpg)
    presentationsbild
  • 33.
    Berg, Amanda
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Learning to Analyze what is Beyond the Visible Spectrum2019Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Thermal cameras have historically been of interest mainly for military applications. Increasing image quality and resolution combined with decreasing camera price and size during recent years have, however, opened up new application areas. They are now widely used for civilian applications, e.g., within industry, to search for missing persons, in automotive safety, as well as for medical applications. Thermal cameras are useful as soon as there exists a measurable temperature difference. Compared to cameras operating in the visual spectrum, they are advantageous due to their ability to see in total darkness, robustness to illumination variations, and less intrusion on privacy.

    This thesis addresses the problem of automatic image analysis in thermal infrared images with a focus on machine learning methods. The main purpose of this thesis is to study the variations of processing required due to the thermal infrared data modality. In particular, three different problems are addressed: visual object tracking, anomaly detection, and modality transfer. All these are research areas that have been and currently are subject to extensive research. Furthermore, they are all highly relevant for a number of different real-world applications.

    The first addressed problem is visual object tracking, a problem for which no prior information other than the initial location of the object is given. The main contribution concerns benchmarking of short-term single-object (STSO) visual object tracking methods in thermal infrared images. The proposed dataset, LTIR (Linköping Thermal Infrared), was integrated in the VOT-TIR2015 challenge, introducing the first ever organized challenge on STSO tracking in thermal infrared video. Another contribution also related to benchmarking is a novel, recursive, method for semi-automatic annotation of multi-modal video sequences. Based on only a few initial annotations, a video object segmentation (VOS) method proposes segmentations for all remaining frames and difficult parts in need for additional manual annotation are automatically detected. The third contribution to the problem of visual object tracking is a template tracking method based on a non-parametric probability density model of the object's thermal radiation using channel representations.

    The second addressed problem is anomaly detection, i.e., detection of rare objects or events. The main contribution is a method for truly unsupervised anomaly detection based on Generative Adversarial Networks (GANs). The method employs joint training of the generator and an observation to latent space encoder, enabling stratification of the latent space and, thus, also separation of normal and anomalous samples. The second contribution is the previously unaddressed problem of obstacle detection in front of moving trains using a train-mounted thermal camera. Adaptive correlation filters are updated continuously and missed detections of background are treated as detections of anomalies, or obstacles. The third contribution to the problem of anomaly detection is a method for characterization and classification of automatically detected district heat leakages for the purpose of false alarm reduction.

    Finally, the thesis addresses the problem of modality transfer between thermal infrared and visual spectrum images, a previously unaddressed problem. The contribution is a method based on Convolutional Neural Networks (CNNs), enabling perceptually realistic transformations of thermal infrared to visual images. By careful design of the loss function the method becomes robust to image pair misalignments. The method exploits the lower acuity for color differences than for luminance possessed by the human visual system, separating the loss into a luminance and a chrominance part.

    Download full text (pdf)
    Learning to Analyze what is Beyond the Visible Spectrum
    Download (png)
    presentationsbild
  • 34.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, The Institute of Technology. Termisk Systemteknik AB, Linköping, Sweden.
    Classification and temporal analysis of district heating leakages in thermal images2014In: Proceedings of The 14th International Symposium on District Heating and Cooling, 2014Conference paper (Other academic)
    Abstract [en]

    District heating pipes are known to degenerate with time and in some cities the pipes have been used for several decades. Due to bad insulation or cracks, energy or media leakages might appear. This paper presents a complete system for large-scale monitoring of district heating networks, including methods for detection, classification and temporal characterization of (potential) leakages. The system analyses thermal infrared images acquired by an aircraft-mounted camera, detecting the areas for which the pixel intensity is higher than normal. Unfortunately, the system also finds many false detections, i.e., warm areas that are not caused by media or energy leakages. Thus, in order to reduce the number of false detections we describe a machine learning method to classify the detections. The results, based on data from three district heating networks show that we can remove more than half of the false detections. Moreover, we also propose a method to characterize leakages over time, that is, repeating the image acquisition one or a few years later and indicate areas that suffer from an increased energy loss.

    Download full text (pdf)
    fulltext
  • 35.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, The Institute of Technology. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Information Coding. Linköping University, The Institute of Technology. Termisk Systemteknik AB, Linköping, Sweden.
    Classification of leakage detections acquired by airborne thermography of district heating networks2014In: 2014 8th IAPR Workshop on Pattern Recognition in Remote Sensing (PRRS), IEEE , 2014, p. 1-4Conference paper (Refereed)
    Abstract [en]

    We address the problem of reducing the number offalse alarms among automatically detected leakages in districtheating networks. The leakages are detected in images capturedby an airborne thermal camera, and each detection correspondsto an image region with abnormally high temperature. Thisapproach yields a significant number of false positives, and wepropose to reduce this number in two steps. First, we use abuilding segmentation scheme in order to remove detectionson buildings. Second, we extract features from the detectionsand use a Random forest classifier on the remaining detections.We provide extensive experimental analysis on real-world data,showing that this post-processing step significantly improves theusefulness of the system.

    Download full text (pdf)
    fulltext
  • 36.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB Linköping, Sweden.
    Classifying district heating network leakages in aerial thermal imagery2014Conference paper (Other academic)
    Abstract [en]

    In this paper we address the problem of automatically detecting leakages in underground pipes of district heating networks from images captured by an airborne thermal camera. The basic idea is to classify each relevant image region as a leakage if its temperature exceeds a threshold. This simple approach yields a significant number of false positives. We propose to address this issue by machine learning techniques and provide extensive experimental analysis on real-world data. The results show that this postprocessing step significantly improves the usefulness of the system.

    Download full text (pdf)
    fulltext
  • 37.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    A thermal infrared dataset for evaluation of short-term tracking methods2015Conference paper (Other academic)
    Abstract [en]

    During recent years, thermal cameras have decreased in both size and cost while improving image quality. The area of use for such cameras has expanded with many exciting applications, many of which require tracking of objects. While being subject to extensive research in the visual domain, tracking in thermal imagery has historically been of interest mainly for military purposes. The available thermal infrared datasets for evaluating methods addressing these problems are few and the ones that do are not challenging enough for today’s tracking algorithms. Therefore, we hereby propose a thermal infrared dataset for evaluation of short-term tracking methods. The dataset consists of 20 sequences which have been collected from multiple sources and the data format used is in accordance with the Visual Object Tracking (VOT) Challenge.

    Download full text (pdf)
    fulltext
  • 38.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    A Thermal Object Tracking Benchmark2015Conference paper (Refereed)
    Abstract [en]

    Short-term single-object (STSO) tracking in thermal images is a challenging problem relevant in a growing number of applications. In order to evaluate STSO tracking algorithms on visual imagery, there are de facto standard benchmarks. However, we argue that tracking in thermal imagery is different than in visual imagery, and that a separate benchmark is needed. The available thermal infrared datasets are few and the existing ones are not challenging for modern tracking algorithms. Therefore, we hereby propose a thermal infrared benchmark according to the Visual Object Tracking (VOT) protocol for evaluation of STSO tracking methods. The benchmark includes the new LTIR dataset containing 20 thermal image sequences which have been collected from multiple sources and annotated in the format used in the VOT Challenge. In addition, we show that the ranking of different tracking principles differ between the visual and thermal benchmarks, confirming the need for the new benchmark.

    Download full text (pdf)
    fulltext
  • 39.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Channel Coded Distribution Field Tracking for Thermal Infrared Imagery2016In: PROCEEDINGS OF 29TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, (CVPRW 2016), IEEE , 2016, p. 1248-1256Conference paper (Refereed)
    Abstract [en]

    We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. The fast progress has been possible thanks to the development of new template-based tracking methods with online template updates, methods which have not been explored for TIR tracking. Instead, tracking methods used for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a template-based tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. In order to avoid background contamination of the object template, we propose to exploit background information for the online template update and to adaptively select the object region used for tracking. Moreover, we propose a novel method for estimating object scale change. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit and a comparison of relative ranking of all common participating trackers in the challenges is provided. Further, the proposed tracker, ABCD, and the VOT-TIR2015 winner SRDCFir are evaluated on maritime data. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.

    Download full text (pdf)
    fulltext
  • 40.
    Berg, Amanda
    et al.
    Linköping University, Faculty of Science & Engineering. Linköping University, Department of Electrical Engineering, Computer Vision. Termisk Syst Tekn AB, Diskettgatan 11 B, SE-58335 Linkoping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Syst Tekn AB, Diskettgatan 11 B, SE-58335 Linkoping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Enhanced analysis of thermographic images for monitoring of district heat pipe networks2016In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 83, no 2, p. 215-223Article in journal (Refereed)
    Abstract [en]

    We address two problems related to large-scale aerial monitoring of district heating networks. First, we propose a classification scheme to reduce the number of false alarms among automatically detected leakages in district heating networks. The leakages are detected in images captured by an airborne thermal camera, and each detection corresponds to an image region with abnormally high temperature. This approach yields a significant number of false positives, and we propose to reduce this number in two steps; by (a) using a building segmentation scheme in order to remove detections on buildings, and (b) to use a machine learning approach to classify the remaining detections as true or false leakages. We provide extensive experimental analysis on real-world data, showing that this post-processing step significantly improves the usefulness of the system. Second, we propose a method for characterization of leakages over time, i.e., repeating the image acquisition one or a few years later and indicate areas that suffer from an increased energy loss. We address the problem of finding trends in the degradation of pipe networks in order to plan for long-term maintenance, and propose a visualization scheme exploiting the consecutive data collections.

    Download full text (pdf)
    fulltext
  • 41.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Generating Visible Spectrum Images from Thermal Infrared2018In: Proceedings 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops CVPRW 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 1224-1233Conference paper (Refereed)
    Abstract [en]

    Transformation of thermal infrared (TIR) images into visual, i.e. perceptually realistic color (RGB) images, is a challenging problem. TIR cameras have the ability to see in scenarios where vision is severely impaired, for example in total darkness or fog, and they are commonly used, e.g., for surveillance and automotive applications. However, interpretation of TIR images is difficult, especially for untrained operators. Enhancing the TIR image display by transforming it into a plausible, visual, perceptually realistic RGB image presumably facilitates interpretation. Existing grayscale to RGB, so called, colorization methods cannot be applied to TIR images directly since those methods only estimate the chrominance and not the luminance. In the absence of conventional colorization methods, we propose two fully automatic TIR to visual color image transformation methods, a two-step and an integrated approach, based on Convolutional Neural Networks. The methods require neither pre- nor postprocessing, do not require any user input, and are robust to image pair misalignments. We show that the methods do indeed produce perceptually realistic results on publicly available data, which is assessed both qualitatively and quantitatively.

    Download full text (pdf)
    Generating Visible Spectrum Images from Thermal Infrared
  • 42.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Object Tracking in Thermal Infrared Imagery based on Channel Coded Distribution Fields2017Conference paper (Other academic)
    Abstract [en]

    We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. Tracking methods designed for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a templatebased tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit and a comparison of relative ranking of all common participating trackers in the challenges is provided. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.

    Download full text (pdf)
    Object Tracking in Thermal Infrared Imagery based on Channel Coded Distribution Fields
  • 43.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Unsupervised Adversarial Learning of Anomaly Detection in the Wild2020In: Proceedings of the 24th European Conference on Artificial Intelligence (ECAI) / [ed] Giuseppe De Giacomo, Alejandro Catala, Bistra Dilkina, Michela Milano, Senén Barro, Alberto Bugarín, Jérôme Lang, Amsterdam: IOS Press, 2020, Vol. 325, p. 1002-1008Conference paper (Refereed)
    Abstract [en]

    Unsupervised learning of anomaly detection in high-dimensional data, such as images, is a challenging problem recently subject to intense research. Through careful modelling of the data distribution of normal samples, it is possible to detect deviant samples, so called anomalies. Generative Adversarial Networks (GANs) can model the highly complex, high-dimensional data distribution of normal image samples, and have shown to be a suitable approach to the problem. Previously published GAN-based anomaly detection methods often assume that anomaly-free data is available for training. However, this assumption is not valid in most real-life scenarios, a.k.a. in the wild. In this work, we evaluate the effects of anomaly contaminations in the training data on state-of-the-art GAN-based anomaly detection methods. As expected, detection performance deteriorates. To address this performance drop, we propose to add an additional encoder network already at training time and show that joint generator-encoder training stratifies the latent space, mitigating the problem with contaminated data. We show experimentally that the norm of a query image in this stratified latent space becomes a highly significant cue to discriminate anomalies from normal data. The proposed method achieves state-of-the-art performance on CIFAR-10 as well as on a large, previously untested dataset with cell images.

    Download full text (pdf)
    fulltext
  • 44.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Visual Spectrum Image Generation fromThermal Infrared2019Conference paper (Other academic)
    Abstract [en]

    We address short-term, single-object tracking, a topic that is currently seeing fast progress for visual video, for the case of thermal infrared (TIR) imagery. Tracking methods designed for TIR are often subject to a number of constraints, e.g., warm objects, low spatial resolution, and static camera. As TIR cameras become less noisy and get higher resolution these constraints are less relevant, and for emerging civilian applications, e.g., surveillance and automotive safety, new tracking methods are needed. Due to the special characteristics of TIR imagery, we argue that template-based trackers based on distribution fields should have an advantage over trackers based on spatial structure features. In this paper, we propose a templatebased tracking method (ABCD) designed specifically for TIR and not being restricted by any of the constraints above. The proposed tracker is evaluated on the VOT-TIR2015 and VOT2015 datasets using the VOT evaluation toolkit and a comparison of relative ranking of all common participating trackers in the challenges is provided. Experimental results show that the ABCD tracker performs particularly well on thermal infrared sequences.

  • 45.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Häger, Gustav
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    An Overview of the Thermal Infrared Visual Object Tracking VOT-TIR2015 Challenge2016Conference paper (Other academic)
    Abstract [en]

    The Thermal Infrared Visual Object Tracking (VOT-TIR2015) Challenge was organized in conjunction with ICCV2015. It was the first benchmark on short-term,single-target tracking in thermal infrared (TIR) sequences. The challenge aimed at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. It was based on the VOT2013 Challenge, but introduced the following novelties: (i) the utilization of the LTIR (Linköping TIR) dataset, (ii) adaption of the VOT2013 attributes to thermal data, (iii) a similar evaluation to that of VOT2015. This paper provides an overview of the VOT-TIR2015 Challenge as well as the results of the 24 participating trackers.

    Download full text (pdf)
    fulltext
  • 46.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Johnander, Joakim
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Zenuity AB, Göteborg, Sweden.
    Durand de Gevigney, Flavie
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Grenoble INP, France.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Semi-automatic Annotation of Objects in Visual-Thermal Video2019In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Institute of Electrical and Electronics Engineers (IEEE), 2019Conference paper (Refereed)
    Abstract [en]

    Deep learning requires large amounts of annotated data. Manual annotation of objects in video is, regardless of annotation type, a tedious and time-consuming process. In particular, for scarcely used image modalities human annotationis hard to justify. In such cases, semi-automatic annotation provides an acceptable option.

    In this work, a recursive, semi-automatic annotation method for video is presented. The proposed method utilizesa state-of-the-art video object segmentation method to propose initial annotations for all frames in a video based on only a few manual object segmentations. In the case of a multi-modal dataset, the multi-modality is exploited to refine the proposed annotations even further. The final tentative annotations are presented to the user for manual correction.

    The method is evaluated on a subset of the RGBT-234 visual-thermal dataset reducing the workload for a human annotator with approximately 78% compared to full manual annotation. Utilizing the proposed pipeline, sequences are annotated for the VOT-RGBT 2019 challenge.

    Download full text (pdf)
    fulltext
  • 47.
    Berg, Amanda
    et al.
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Öfjäll, Kristoffer
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Termisk Systemteknik AB, Linköping, Sweden.
    Felsberg, Michael
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Detecting Rails and Obstacles Using a Train-Mounted Thermal Camera2015In: Image Analysis: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, June 15-17, 2015. Proceedings / [ed] Rasmus R. Paulsen; Kim S. Pedersen, Springer, 2015, p. 492-503Conference paper (Refereed)
    Abstract [en]

    We propose a method for detecting obstacles on the railway in front of a moving train using a monocular thermal camera. The problem is motivated by the large number of collisions between trains and various obstacles, resulting in reduced safety and high costs. The proposed method includes a novel way of detecting the rails in the imagery, as well as a way to detect anomalies on the railway. While the problem at a first glance looks similar to road and lane detection, which in the past has been a popular research topic, a closer look reveals that the problem at hand is previously unaddressed. As a consequence, relevant datasets are missing as well, and thus our contribution is two-fold: We propose an approach to the novel problem of obstacle detection on railways and we describe the acquisition of a novel data set.

    Download full text (pdf)
    fulltext
  • 48.
    Berglin, Lukas
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Design, Evaluation and Implementation of a Pipeline for Semi-Automatic Lung Nodule Segmentation2016Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Lung cancer is the most common type of cancer in the world and always manifests as lung nodules. Nodules are small tumors that consist of lung tissue. They are usually spherical in shape and their cores can be either solid or subsolid. Nodules are common in lungs, but not all of them are malignant. To determine if a nodule is malignant or benign, attributes like nodule size and volume growth are commonly used. The procedure to obtain these attributes is time consuming, and therefore calls for tools to simplify the process.

    The purpose of this thesis work was to investigate  the feasibility of a semi-automatic lungnodule segmentation pipeline including volume estimation. This was done by implementing, tuning and evaluating image processing algorithms with different characteristics to create pipeline candidates. These candidates were compared using a similarity index between their segmentation results and ground truth markings to determine the most promising one.

    The best performing pipeline consisted of a fixed region of interest together with a level set segmentation algorithm. Its segmentation accuracy was not consistent for all nodules evaluated, but the pipeline showed great potential when dynamically adapting its parameters for each nodule. The use of dynamic parameters was only brie y explored, and further research would be necessary to determine its feasibility.

    Download full text (pdf)
    fulltext
  • 49.
    Bergström, Linus
    Linköping University, Department of Electrical Engineering, Computer Vision.
    Video Based Cross-View Geo-Localization of UAV Imagery2024Independent thesis Advanced level (degree of Master (Two Years)), 28 HE creditsStudent thesis
    Abstract [en]

    This master’s thesis investigates the potential of enhancing cross-view localization (CVL) of UAV imagery using video input. The study, conducted in collaboration with Maxar Intelligence and Linköping University, aims to improve localization performance by leveraging sequential image data captured by UAVs and with satellite imagery as a reference. The thesis explores three main questions; the potential of modifying single-image CVL models to process image sequences, the adaptability of existing ground-to-satellite video CVL models for UAV-to-satellite localization, and the performance comparison of aerial views and satellite imagery extracted from Maxar’s 3D reconstructed models with public datasets.

    To address these questions, synthetic datasets mimicking University-1652 and SUES-200 were created using Maxar’s 3D data. The results demonstrated that while single-image models like Sample4Geo achieved high accuracy on synthetic datasets, their performance diminished with data that more closely represented the real world due to overfitting to the uniformity of synthetic environments. Existing ground-to-satellite video CVL models faced practical challenges when adapted to UAV perspectives, highlighting the need for tailored approaches. The synthetic datasets from Maxar outperformed public datasets but lacked the variability and realism of real-world conditions.

    Furthermore, the study introduced a novel multiple-image CVL model that leverages a post-processing technique using density-based clustering (OPTICS) to enhance localization accuracy by filtering noisy predictions and identifying high confidence clusters. The findings indicate that while this approach improves localization performance, significant advancements are required to fully utilize video input for UAV-to-satellite localization in dynamic environments.

    Download full text (pdf)
    fulltext
  • 50.
    Bešenić, Krešimir
    et al.
    Faculty of Electrical Engineering and Computing, University of Zagreb,.
    Ahlberg, Jörgen
    Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
    Pandžić, Igor
    Faculty of Electrical Engineering and Computing, University of Zagreb,.
    Let Me Take a Better Look: Towards Video-Based Age Estimation2024In: Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods - ICPRAM, Rome , Italy, 2024, p. 57-59Conference paper (Refereed)
    Abstract [en]

    Taking a better look at subjects of interest helps humans to improve confidence in their age estimation. Unlike still images, sequences offer spatio-temporal dynamic information that contains many cues related to age progression. A review of previous work on video-based age estimation indicates that this is an underexplored field of research. This may be caused by a lack of well-defined and publicly accessible video benchmark protocol, as well as the absence of video-oriented training data. To address the former issue, we propose a carefully designed video age estimation benchmark protocol and make it publicly available. To address the latter issue, we design a video-specific age estimation method that leverages pseudo-labeling and semi-supervised learning. Our results show that the proposed method outperforms image-based baselines on both offline and online benchmark protocols, while the online estimation stability is improved by more than 50%.

    Download full text (pdf)
    fulltext
1234567 1 - 50 of 661
CiteExportLink to result list
Permanent link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf