
Abstract

In image fusion, the central task is to generate high-quality images that highlight key objects while making the scene easier to interpret. To accomplish this with strong interpretability and generalization ability, and to produce fusion results that are well suited to downstream vision tasks (such as object detection and segmentation), we present a novel interpretable decomposition scheme and develop a target-aware Taylor expansion approximation (T²EA) network for infrared and visible image fusion. T²EA comprises the following key procedures. First, the visible and infrared images are each decomposed into feature maps by a purpose-built Taylor expansion approximation (TEA) network. Then, the Taylor feature maps are hierarchically fused by a dual-branch feature fusion (DBFF) network. Next, the fused maps of all layers are combined through the inverse Taylor expansion to synthesize the fusion result. Finally, a segmentation network is joined to the fusion network to refine its parameters, making the fusion results more suitable for object segmentation. To validate the effectiveness of the T²EA network, we first discuss the selection of Taylor expansion layers and fusion strategies. We then compare quantitative and qualitative results against selected SOTA approaches on three datasets (MSRS, TNO, and LLVIP) in terms of testing, generalization, and target detection and segmentation, demonstrating that T²EA produces more competitive fusion results for vision tasks and adapts better to different images. The code will be available at https://github.com/MysterYxby/T2EA.
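The decompose, fuse, and reconstruct flow described above can be illustrated with a toy sketch. It is not the learned TEA or DBFF network: the layers here are hand-crafted residuals of a 3-tap smoothing filter applied to a single image row, and the fusion rules (largest magnitude for detail layers, averaging for the base layer) are assumptions chosen only for illustration. All function names are hypothetical.

```python
# Toy stand-in for the decompose -> fuse -> reconstruct pipeline.
# NOT the authors' learned TEA/DBFF networks: layers are residuals of a
# simple smoothing filter, and the fusion rules are illustrative assumptions.

def smooth(row):
    """3-tap moving average with edge replication."""
    n = len(row)
    return [(row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]

def decompose(row, n_layers):
    """Split a row into n_layers detail terms plus one base term.

    The terms sum back to the input, loosely mirroring how the terms of a
    series expansion add up to the original function.
    """
    layers = []
    current = list(row)
    for _ in range(n_layers):
        base = smooth(current)
        layers.append([c - b for c, b in zip(current, base)])  # detail term
        current = base
    layers.append(current)  # remaining base term
    return layers

def fuse_layers(a_layers, b_layers):
    """Fuse layer by layer: max-magnitude for details, mean for the base."""
    last = len(a_layers) - 1
    fused = []
    for k, (a, b) in enumerate(zip(a_layers, b_layers)):
        if k == last:
            fused.append([(x + y) / 2.0 for x, y in zip(a, b)])
        else:
            fused.append([x if abs(x) >= abs(y) else y for x, y in zip(a, b)])
    return fused

def reconstruct(layers):
    """Inverse step: sum all layers element-wise."""
    return [sum(values) for values in zip(*layers)]

if __name__ == "__main__":
    visible = [0.1, 0.2, 0.9, 0.2, 0.1]   # made-up pixel intensities
    infrared = [0.0, 0.8, 0.1, 0.0, 0.7]  # made-up pixel intensities
    fused = reconstruct(fuse_layers(decompose(visible, 3),
                                    decompose(infrared, 3)))
    print(fused)
```

Because the layers telescope, reconstructing an unfused decomposition returns the input row up to floating-point error. That exact-inverse property is what makes a per-layer fusion scheme of this kind easy to inspect.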
