Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/26633
Title: Scale-wise interaction fusion and knowledge distillation network for aerial scene recognition
Authors: Ning, H
Lei, T
An, M
Sun, H
Hu, Z
Nandi, AK
Keywords: deep learning;image analysis;image classification;information fusion
Issue Date: 4-Mar-2023
Publisher: Wiley on behalf of The Institution of Engineering and Technology and Chongqing University of Technology
Citation: Ning, H. et al. (2023) 'Scale-wise interaction fusion and knowledge distillation network for aerial scene recognition', CAAI Transactions on Intelligence Technology, 0 (ahead-of-print), pp. 1 - 13. doi: 10.1049/cit2.12208.
Abstract: Copyright © 2023 The Authors. Aerial scene recognition (ASR) has attracted great attention due to its increasingly essential applications. Most ASR methods adopt a multi-scale architecture because both global and local features play important roles in ASR. However, existing multi-scale methods neglect the effective interactions among different scales and various spatial locations when fusing global and local features, which limits their ability to deal with the challenges of large-scale variation and complex backgrounds in aerial scene images. In addition, existing methods may suffer from poor generalisation due to millions of to-be-learnt parameters and inconsistent predictions between global and local features. To tackle these problems, this study proposes a scale-wise interaction fusion and knowledge distillation (SIF-KD) network for learning robust and discriminative features with scale-invariant and background-independent information. The main highlights of this study include two aspects. On the one hand, a global-local feature collaborative learning scheme is devised for extracting scale-invariant features so as to tackle the large-scale variation problem in aerial scene images. Specifically, a plug-and-play multi-scale context attention fusion module is proposed for collaboratively fusing the context information between global and local features. On the other hand, a scale-wise knowledge distillation scheme is proposed to produce more consistent predictions by distilling the predictive distribution between different scales during training. Comprehensive experimental results show that the proposed SIF-KD network achieves the best overall accuracy, with 99.68%, 98.74% and 95.47% on the UCM, AID and NWPU-RESISC45 datasets, respectively, compared with state-of-the-art methods.
Description: Data availability statement: Data sharing is not applicable to this article as no new data were created or analysed in this study.
URI: https://bura.brunel.ac.uk/handle/2438/26633
DOI: https://doi.org/10.1049/cit2.12208
ISSN: 2468-6557
Other Identifiers: ORCID iDs: Hailong Ning https://orcid.org/0000-0001-8375-1181; Asoke K. Nandi https://orcid.org/0000-0001-6248-2875.
Appears in Collections: Dept of Electronic and Electrical Engineering Research Papers

Files in This Item:
File: FullText.pdf
Description: Copyright © 2023 The Authors. CAAI Transactions on Intelligence Technology published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology and Chongqing University of Technology. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
Size: 2.66 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License.