
基于深度学习的视频超分辨率重建算法进展

唐麒 赵耀 刘美琴 姚超

唐麒, 赵耀, 刘美琴, 姚超. 基于深度学习的视频超分辨率重建算法进展. 自动化学报, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c240235
Tang Qi, Zhao Yao, Liu Mei-Qin, Yao Chao. A review of video super-resolution algorithms based on deep learning. Acta Automatica Sinica, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c240235


doi: 10.16383/j.aas.c240235 cstr: 32138.14.j.aas.c240235
基金项目: 中央高校基本科研业务费专项资金资助(2024JBZX001), 国家自然科学基金(62120106009, 62332017, 62372036) 资助
    作者简介:

    唐麒:北京交通大学信息科学研究所硕士研究生. 主要研究方向为图像与视频复原. E-mail: qitang@bjtu.edu.cn

    赵耀:北京交通大学信息科学研究所教授. 主要研究方向为图像/视频压缩, 数字媒体内容安全, 媒体内容分析与理解, 人工智能. E-mail: yzhao@bjtu.edu.cn

    刘美琴:北京交通大学信息科学研究所教授. 主要研究方向为多媒体信息处理, 三维视频处理, 视频智能编码. 本文通信作者. E-mail: mqliu@bjtu.edu.cn

    姚超:北京科技大学计算机与通信工程学院副教授. 主要研究方向为图像/视频压缩, 计算机视觉和人机交互. E-mail: yaochao@ustb.edu.cn

A Review of Video Super-resolution Algorithms Based on Deep Learning

Funds: Supported by Fundamental Research Funds for the Central Universities (2024JBZX001), and National Natural Science Foundation of China (62120106009, 62332017, 62372036)
    Author Bio:

    TANG Qi  Master student at Institute of Information Science, Beijing Jiaotong University. His research interest covers image and video restoration

    ZHAO Yao  Professor at Institute of Information Science, Beijing Jiaotong University. His research interest covers image/video compression, digital media content security, media content analysis and understanding, artificial intelligence

    LIU Mei-Qin  Professor at Institute of Information Science, Beijing Jiaotong University. Her research interest covers multimedia information processing, 3D video processing and video intelligent coding. Corresponding author of this paper

    YAO Chao  Associate Professor at School of Computer and Communication Engineering, University of Science and Technology Beijing. His research interest covers image/video compression, computer vision, and human–computer interaction

  • Abstract: Video super-resolution (VSR) is an important research direction in low-level computer vision. It aims to exploit the intra-frame and inter-frame information of a low-resolution video to reconstruct a high-resolution video with richer detail and temporally consistent content, which both boosts the performance of downstream tasks and improves the viewing experience. In recent years, deep-learning-based VSR algorithms have emerged in large numbers and achieved breakthroughs in inter-frame alignment, information propagation, and related components. After a brief introduction to the VSR task, this paper surveys the public VSR datasets and existing algorithms; it then reviews in detail the innovative progress of deep-learning-based VSR algorithms; finally, it summarizes the open challenges and future development trends of VSR.
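The synthetic benchmarks surveyed below all instantiate a standard degradation model; as a brief aside (the notation here is the conventional one, not quoted from this paper), the low-resolution frames are typically generated as:

```latex
% y_t: t-th low-resolution frame, x_t: t-th high-resolution frame
% k: blur kernel (a bicubic anti-aliasing filter in the "BI" setting,
%    a Gaussian in the "BD" setting)
% \downarrow_s: decimation by scale factor s (commonly s = 4), n_t: optional noise
y_t = (x_t \ast k)\downarrow_s +\, n_t
```

This distinction is why the benchmark tables below report results separately for bicubic downsampling and Gaussian-blur downsampling.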
  • 图  1  视频超分辨率重建数据集REDS (左)和Vimeo-90K (右)示例

    Fig.  1  Examples of video super-resolution datasets from REDS (left) and Vimeo-90K (right)

    图  2  部分VSR模型在REDS数据集的可视化比较结果

    Fig.  2  Visual comparison results of VSR methods on REDS dataset

    图  3  部分VSR模型在Vid4数据集的可视化比较结果

    Fig.  3  Visual comparison results of VSR methods on Vid4 dataset

    图  4  本文的结构图

    Fig.  4  Architecture of the paper

    图  5  基于深度学习的视频超分辨率重建时间脉络图

    Fig.  5  Timeline of video super-resolution based on deep learning

    图  6  VSRNet结构图

    Fig.  6  Architecture of VSRNet

    图  7  VESPCN结构图

    Fig.  7  Architecture of VESPCN

    图  8  SOFVSR结构图

    Fig.  8  Architecture of SOFVSR

    图  9  TOFlow结构图

    Fig.  9  Architecture of TOFlow

    图  10  DUF结构图

    Fig.  10  Architecture of DUF

    图  11  FSTRN结构图

    Fig.  11  Architecture of FSTRN

    图  12  TDAN结构图

    Fig.  12  Architecture of TDAN

    图  13  EDVR结构图

    Fig.  13  Architecture of EDVR

    图  14  TGA结构图

    Fig.  14  Architecture of TGA

    图  15  MuCAN结构图

    Fig.  15  Architecture of MuCAN

    图  16  MANA结构图

    Fig.  16  Architecture of MANA

    图  17  IAM结构图

    Fig.  17  Architecture of IAM

    图  18  VSR Transformer结构图

    Fig.  18  Architecture of VSR Transformer

    图  19  VRT结构图

    Fig.  19  Architecture of VRT

    图  20  DRVSR结构图

    Fig.  20  Architecture of DRVSR

    图  21  FRVSR结构图

    Fig.  21  Architecture of FRVSR

    图  22  RBPN结构图

    Fig.  22  Architecture of RBPN

    图  23  RLSP结构图

    Fig.  23  Architecture of RLSP

    图  24  RSDN结构图

    Fig.  24  Architecture of RSDN

    图  25  RRN结构图

    Fig.  25  Architecture of RRN

    图  26  DAP结构图

    Fig.  26  Architecture of DAP

    图  27  ETDM结构图

    Fig.  27  Architecture of ETDM

    图  28  TMP结构图

    Fig.  28  Architecture of TMP

    图  29  BRCN结构图

    Fig.  29  Architecture of BRCN

    图  30  RRCN结构图

    Fig.  30  Architecture of RRCN

    图  31  PFNL结构图和PFRB细节图

    Fig.  31  Architecture of PFNL and Detail of PFRB

    图  32  RISTN结构图

    Fig.  32  Architecture of RISTN

    图  33  LOVSR(左)和GOVSR(右)结构图

    Fig.  33  Architectures of LOVSR (left) and GOVSR (right)

    图  34  BasicVSR(左)和ICONVSR(右)结构图

    Fig.  34  Architectures of BasicVSR (left) and ICONVSR (right)

    图  35  TTVSR结构图

    Fig.  35  Architecture of TTVSR

    图  36  CTVSR结构图

    Fig.  36  Architecture of CTVSR

    图  37  RefVSR结构图

    Fig.  37  Architecture of RefVSR

    图  38  C2-Matching结构图

    Fig.  38  Architecture of C2-Matching

    图  39  RealBasicVSR结构图

    Fig.  39  Architecture of RealBasicVSR

    图  40  FTVSR结构图

    Fig.  40  Architecture of FTVSR

    图  41  BasicVSR++ 结构图

    Fig.  41  Architecture of BasicVSR++

    图  42  PSRT结构图

    Fig.  42  Architecture of PSRT

    图  43  IART结构图

    Fig.  43  Architecture of IART

    图  44  MFPI结构图

    Fig.  44  Architecture of MFPI

    图  45  DFVSR结构图

    Fig.  45  Architecture of DFVSR

    图  46  MIA-VSR结构图

    Fig.  46  Architecture of MIA-VSR

    图  47  RVRT结构图

    Fig.  47  Architecture of RVRT

    图  48  TecoGAN结构图

    Fig.  48  Architecture of TecoGAN

    图  49  StableVSR结构图

    Fig.  49  Architecture of StableVSR

    图  50  MGLD结构图

    Fig.  50  Architecture of MGLD

    图  51  Upscale-A-Video结构图

    Fig.  51  Architecture of Upscale-A-Video

    图  52  不同帧间对齐模式示意图

    Fig.  52  Illustration of different inter-frame alignment

    图  53  基于光流的显式运动对齐

    Fig.  53  Explicit alignment based on optical flow

    图  54  基于可变形卷积的对齐

    Fig.  54  Deformable convolution-based alignment

    图  55  光流引导的可变形对齐和光流引导的可变形注意力

    Fig.  55  Flow-guided deformable alignment and flow-guided deformable attention

    图  56  基于3D卷积的帧间对齐

    Fig.  56  Inter-frame alignment based on 3D convolution

    表  1  基于深度学习的视频超分辨率重建数据集

    Table  1  Datasets of video super-resolution based on deep learning

    | Category | Dataset | Type | Videos | Frames | Resolution | Color space |
    | --- | --- | --- | --- | --- | --- | --- |
    | Synthetic | YUV25[15] | Training | 25 | − | 386 × 288 | YUV |
    | Synthetic | TDTFF[16]: Turbine | Test | 5 | − | 648 × 528 | YUV |
    | Synthetic | TDTFF[16]: Dancing | Test | | − | 950 × 530 | YUV |
    | Synthetic | TDTFF[16]: Treadmill | Test | | − | 700 × 600 | YUV |
    | Synthetic | TDTFF[16]: Flag | Test | | − | 1000 × 580 | YUV |
    | Synthetic | TDTFF[16]: Fan | Test | | − | 990 × 740 | YUV |
    | Synthetic | Vid4[13]: Foliage | Test | 4 | 49 | 720 × 480 | RGB |
    | Synthetic | Vid4[13]: Walk | Test | | 47 | 720 × 480 | RGB |
    | Synthetic | Vid4[13]: Calendar | Test | | 41 | 720 × 576 | RGB |
    | Synthetic | Vid4[13]: City | Test | | 34 | 704 × 576 | RGB |
    | Synthetic | YUV21[17] | Test | 21 | 100 | 352 × 288 | YUV |
    | Synthetic | Venice[18] | Training | 1 | 10773 | 3840 × 2160 | RGB |
    | Synthetic | Myanmar[19] | Training | 1 | 527 | 3840 × 2160 | RGB |
    | Synthetic | CDVL[20] | Training | 100 | 30 | 1920 × 1080 | RGB |
    | Synthetic | UVGD[21] | Test | 16 | − | 3840 × 2160 | YUV |
    | Synthetic | LMT[22] | Training | 26 | − | 1920 × 1080 | YCbCr |
    | Synthetic | SPMCS[23] | Training & test | 975 | 31 | 960 × 540 | RGB |
    | Synthetic | MM542[24] | Training | 542 | 32 | 1280 × 720 | RGB |
    | Synthetic | UDM10[25] | Test | 10 | 32 | 1272 × 720 | RGB |
    | Synthetic | Vimeo-90K[12] | Training & test | 91701 | 7 | 448 × 256 | RGB |
    | Synthetic | REDS[14] | Training & test | 270 | 100 | 1280 × 720 | RGB |
    | Synthetic | Parkour[26] | Test | 14 | − | 960 × 540 | RGB |
    | Real-world | RealVSR[27] | Training & test | 500 | 50 | 1024 × 512 | RGB/YCbCr |
    | Real-world | VideoLQ[28] | Test | 50 | 100 | 1024 × 512 | RGB |
    | Real-world | RealMCVSR[29] | Training & test | 161 | − | 1920 × 1080 | RGB |
    | Real-world | MVSR4×[30] | Training & test | 300 | 100 | 1920 × 1080 | RGB |
    | Real-world | DTVIT[31] | Training & test | 196 | 100 | 1920 × 1080 | RGB |
    | Real-world | YouHQ[32] | Training & test | 38616 | 32 | 1920 × 1080 | RGB |
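The tables that follow report PSNR/SSIM fidelity scores, computed either on RGB channels or on the Y (luma) channel. As a minimal pure-Python sketch of these metrics (the `rgb_to_y` coefficients are the common BT.601 choice, and `ssim_global` is a simplified single-window variant of SSIM, which is normally averaged over local sliding windows; both function names are illustrative):

```python
import math

def rgb_to_y(r, g, b):
    # BT.601 luma transform, commonly used when papers report "Y-channel" metrics
    return 0.257 * r + 0.504 * g + 0.098 * b + 16

def psnr(ref, dist, peak=255.0):
    # Peak signal-to-noise ratio between two equal-length pixel sequences
    mse = sum((a - b) ** 2 for a, b in zip(ref, dist)) / len(ref)
    if mse == 0:
        return float("inf")
    return 10 * math.log10(peak ** 2 / mse)

def ssim_global(x, y, peak=255.0):
    # Single-window SSIM: the standard metric averages this statistic over
    # local sliding windows; this simplified version uses one global window.
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```

Identical frames give infinite PSNR and SSIM of 1.0; higher is better for both.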

    表  2  对双三次插值下采样后的视频进行VSR的性能对比结果

    Table  2  Performance comparison of video super-resolution algorithm with bicubic downsampling

    Method | Training frames | Params (M) | REDS (RGB channels) | Vimeo-90K-T (Y channel) | Vid4 (Y channel)
    (PSNR/SSIM under bicubic downsampling; "−/−" means not reported)
    Bicubic 26.14/0.7292 31.32/0.8684 23.78/0.6347
    VSRNet[40] 0.27 −/− −/− 22.81/0.6500
    VSRResFeatGAN[41] −/− −/− 24.50/0.7023
    VESPCN[42] −/− −/− 25.35/0.7577
    VSRResNet[41] −/− −/− 25.51/0.7530
    SPMC[23] 2.17 −/− −/− 25.52/0.7600
    3DSRNet[43] −/− −/− 25.71/0.7588
    RRCN[44] −/− −/− 25.86/0.7591
    TOFlow[12] 5/7 1.41 27.98/0.7990 33.08/0.9054 25.89/0.7651
    STARNet[45] 111.61 −/− 30.83/0.9290 −/−
    MEMC-Net[46] −/− 33.47/0.9470 24.37/0.8380
    STMN[47] −/− −/− 25.90/0.7878
    SOFVSR[48] 1.71 −/− −/− 26.01/0.7710
    RISTN[49] 3.67 −/− −/− 26.13/0.7920
    MMCNN[24] 10.58 −/− −/− 26.28/0.7844
    RTVSR[50] 15.00 −/− −/− 26.36/0.7900
    TDAN[51] 1.97 −/− −/− 26.42/0.7890
    D3DNet[52] −/7 2.58 −/− 35.65/0.9330 26.52/0.7990
    FFCVSR[53] −/− −/− 26.97/0.8300
    EVSRNet[54] 27.85/0.8000 −/− −/−
    StableVSR[55] 27.97/0.8000 −/− −/−
    DUF[56] 7/7 5.8 28.63/0.8251 −/− 27.33/0.8319
    PFNL[57] 7/7 3 29.63/0.8502 36.14/0.9363 26.73/0.8029
    DNSTNet[58] −/− 36.86/0.9387 27.21/0.8220
    RBPN[59] 7/7 12.2 30.09/0.8590 37.07/0.9435 27.12/0.8180
    DSMC[60] 11.58 30.29/0.8381 −/− 27.29/0.8403
    Boosted EDVR[31] 30.53/0.8699 −/− −/−
    TMP[61] 3.1 30.67/0.8710 −/− 27.10/0.8167
    MuCAN[62] 5/7 30.88/0.8750 37.32/0.9465 −/−
    MSFFN[63] −/− 37.33/0.9467 27.23/0.8218
    DAP[64] 15/5 30.59/0.8703 −/− −/−
    MultiBoot VSR[65] 60.86 31.00/0.8822 −/− −/−
    SSL-bi[66] 15/14 1.0 31.06/0.8933 36.82/0.9419 27.15/0.8208
    EDVR[67] 5/7 20.6 31.09/0.8800 37.61/0.9489 27.35/0.8264
    RLSP[68] 4.2 −/− 37.39/0.9470 27.15/0.8202
    TGA[69] 5.8 −/− 37.43/0.9480 27.19/0.8213
    KSNet-bi[70] 3.0 31.14/0.8862 37.54/0.9503 27.22/0.8245
    VSR-T[71] 5/7 32.6 31.19/0.8815 37.71/0.9494 27.36/0.8258
    PSRT-sliding[72] 5/− 14.8 31.32/0.8834 −/− −/−
    SeeClear[73] 5/5 229.23 31.32/0.8856 37.64/0.9503 27.80/0.8404
    DPR[74] 6.3 31.38/0.8907 37.11/0.9446 27.19/0.8243
    BasicVSR[75] 15/14 6.3 31.42/0.8909 37.18/0.9450 27.24/0.8251
    Boosted BasicVSR[31] 31.42/0.8917 −/− −/−
    SATeCo[76] 6/6 31.62/0.8932 −/− 27.44/0.8420
    IconVSR[75] 15/14 8.7 31.67/0.8948 37.47/0.9476 27.39/0.8279
    ICNet[77] 18.34 31.71/0.8963 37.72/0.9477 27.43/0.8287
    MSHPFNL[78] 7.77 −/− 36.75/0.9406 27.70/0.8472
    PA[79] 5/7 38.2 32.05/0.8941 −/− 28.02/0.8373
    FTVSR[80] 10.8 31.82/0.8960 −/− −/−
    $ C^2 $-Matching[81] 32.05/0.9010 −/− 28.87/0.8960
    ETDM[82] 8.4 32.15/0.9024 −/− −/−
    BasicVSR++[83] 30/14 7.3 32.39/0.9069 37.79/0.9500 27.79/0.8400
    RTA[84] 5/7 17 31.30/0.8850 37.84/0.9498 27.90/0.8380
    Semantic Lens[85] 5/− 31.42/0.8881 −/− −/−
    TCNet[86] 9.6 31.82/0.9002 37.94/0.9514 27.48/0.8380
    TTVSR[87] 50/− 6.8 32.12/0.9021 −/− −/−
    VRT[88] 16/7 35.6 32.19/0.9006 38.20/0.9530 27.93/0.8425
    CTVSR[89] 16/14 34.5 32.28/0.9047 −/− 28.03/0.8487
    FTVSR++[90] 10.8 32.42/0.9070 −/− −/−
    LGDFNet-BPP[91] 9.0 32.53/0.9007 −/− 27.99/0.8409
    PP-MSVSR-L[92] 7.4 32.53/0.9083 −/− −/−
    CFD-BasicVSR++[127] 30/7 7.5 32.51/0.9083 37.90/0.9504 27.84/0.8406
    RVRT[93] 30/14 10.8 32.75/0.9113 38.15/0.9527 27.99/0.8426
    DFVSR[94] 7.1 32.76/0.9081 38.25/0.9556 27.92/0.8427
    PSRT-recurrent[72] 16/14 13.4 32.72/0.9106 38.27/0.9536 28.07/0.8485
    MFPI[95] −/− 7.3 32.81/0.9106 38.28/0.9534 28.11/0.8481
    EvTexture[96] 15/− 8.9 32.79/0.9174 38.23/0.9544 29.51/0.8909
    MIA-VSR[97] 16/14 16.5 32.78/0.9220 38.22/0.9532 28.20/0.8507
    CFD-PSRT[127] 30/7 13.6 32.83/0.9140 38.33/0.9548 28.18/0.8503
    IART[98] 16/7 13.4 32.90/0.9138 38.14/0.9528 28.26/0.8517
    EvTexture+[96] 15/− 10.1 32.93/0.9195 38.32/0.9558 29.78/0.8983
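Table 3 evaluates the "BD" setting, where frames are Gaussian-blurred before decimation. A minimal 1D sketch of that degradation follows (kernel size 13 and sigma 1.6 are common choices in the VSR literature, not values quoted from this paper; real pipelines apply the kernel separably over each frame's rows and columns):

```python
import math

def gaussian_kernel(size=13, sigma=1.6):
    # Normalized 1D Gaussian kernel
    center = size // 2
    k = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(size)]
    s = sum(k)
    return [v / s for v in k]

def blur_downsample(signal, scale=4, size=13, sigma=1.6):
    # Blur a 1D signal with the Gaussian kernel, then keep every `scale`-th
    # sample, mirroring the "Gaussian blur + downsample" degradation.
    k = gaussian_kernel(size, sigma)
    half = size // 2
    blurred = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(k):
            idx = min(max(i + j - half, 0), len(signal) - 1)  # replicate-pad borders
            acc += w * signal[idx]
        blurred.append(acc)
    return blurred[::scale]
```

A constant signal passes through unchanged (the kernel is normalized), while edges and textures are smoothed before decimation, which is what makes BD restoration different from the bicubic ("BI") setting of Table 2.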

    表  3  对高斯模糊下采样后的视频进行VSR的性能对比结果

    Table  3  Performance comparison of video super-resolution algorithm with blur downsampling

    Method | Training frames | Params (M) | UDM10 (Y channel) | Vimeo-90K-T (Y channel) | Vid4 (Y channel)
    (PSNR/SSIM under Gaussian-blur downsampling; "−/−" means not reported)
    Bicubic 28.47/0.8253 31.30/0.8687 21.80/0.5246
    BRCN[99] −/− −/− 24.43/0.6334
    ToFNet[12] 5/7 1.41 36.26/0.9438 34.62/0.9212 25.85/0.7659
    TecoGAN[100] 3.00 −/− −/− 25.89/−
    SOFVSR[48] 1.71 −/− −/− 26.19/0.7850
    RRN[101] 3.4 38.96/0.9644 −/− 27.69/0.8488
    TDAN[51] 1.97 −/− −/− 26.86/0.8140
    FRVSR[102] 5.1 −/− −/− 26.69/0.8220
    DUF[56] 7/7 5.8 38.48/0.9605 36.87/0.9447 27.38/0.8329
    RLSP[68] 4.2 38.48/0.9606 36.49/0.9403 27.48/0.8388
    PFNL[57] 7/7 3 38.74/0.9627 −/− 27.16/0.8355
    RBPN[59] 7/7 12.2 38.66/0.9596 37.20/0.9458 27.17/0.8205
    TMP[61] 3.1 −/− 37.33/0.9481 27.61/0.8428
    TGA[69] 5.8 38.74/0.9627 37.59/0.9516 27.63/0.8423
    SSL-bi[66] 15/14 1.0 39.35/0.9665 37.06/0.9458 27.56/0.8431
    RSDN[103] 6.19 −/− 37.23/0.9471 27.02/0.8505
    DAP[64] 15/5 39.50/0.9664 37.25/0.9472 −/−
    SeeClear[73] 5/5 229.23 39.72/0.9675 −/− −/−
    EDVR[67] 5/7 20.6 39.89/0.9686 37.81/0.9523 27.85/0.8503
    DPR[74] 6.3 39.72/0.9684 37.24/0.9461 27.89/0.8539
    BasicVSR[75] 15/14 6.3 39.96/0.9694 37.53/0.9498 27.96/0.8553
    IconVSR[75] 15/14 8.7 40.03/0.9694 37.84/0.9524 28.04/0.8570
    R2D2[104] 8.25 39.53/0.9670 −/− 28.13/0.9244
    FTVSR[80] 10.8 −/− −/− 28.31/0.8600
    FDAN[105] 39.91/0.9686 37.75/0.9522 27.88/0.8508
    PP-MSVSR[92] 1.45 40.06/0.9699 37.54/0.9499 28.13/0.8604
    GOVSR[106] 40.14/0.9713 37.63/0.9503 28.41/0.8724
    ETDM[82] 8.4 40.11/0.9707 −/− 28.81/0.8725
    TTVSR[87] 50/− 6.8 40.41/0.9712 37.92/0.9526 28.40/0.8643
    BasicVSR++[83] 30/14 7.3 40.72/0.9722 38.21/0.9550 29.04/0.8753
    CFD-BasicVSR++[127] 30/7 7.5 40.77/0.9726 38.36/0.9557 29.14/0.8760
    TCNet[86] 9.6 −/− −/− 28.44/0.8730
    VRT[88] 16/7 35.6 41.05/0.9737 38.72/0.9584 29.42/0.8795
    CTVSR[89] 16/14 34.5 41.20/0.9740 38.83/0.9580 29.28/0.8811
    FTVSR++[90] 10.8 −/− −/− 28.80/0.8680
    LGDFNet-BPP[91] 9.0 40.81/0.9756 −/− 29.39/0.8798
    RVRT[93] 30/14 10.8 40.90/0.9729 38.59/0.9576 29.54/0.8810
    DFVSR[94] 7.1 40.97/0.9733 38.51/0.9571 29.56/0.8983
    MFPI[95] −/− 7.3 41.08/0.9741 38.70/0.9579 29.34/0.8781

    表  4  真实场景下的VSR性能对比结果

    Table  4  Performance comparison of real-world video super-resolution algorithm

    | Method | Inference frames | RealVSR PSNR/SSIM/LPIPS | MVSR4× PSNR/SSIM/LPIPS |
    | --- | --- | --- | --- |
    | RSDN[103] | Previous frames | 23.91/0.7743/0.224 | 23.15/0.7533/0.279 |
    | FSTRN[107] | 7 | 23.36/0.7683/0.240 | 22.66/0.7433/0.315 |
    | TOF[12] | 7 | 23.62/0.7739/0.220 | 22.80/0.7502/0.279 |
    | TDAN[51] | 7 | 23.71/0.7737/0.229 | 23.07/0.7492/0.282 |
    | EDVR[67] | 7 | 23.96/0.7781/0.216 | 23.51/0.7611/0.268 |
    | BasicVSR[75] | All frames | 24.00/0.7801/0.209 | 23.38/0.7594/0.270 |
    | MANA[108] | All frames | 23.89/0.7781/0.224 | 23.15/0.7513/0.285 |
    | TTVSR[87] | All frames | 24.08/0.7837/0.213 | 23.60/0.7686/0.277 |
    | ETDM[82] | All frames | 24.13/0.7896/0.206 | 23.61/0.7662/0.260 |
    | BasicVSR++[83] | All frames | 24.24/0.7933/0.216 | 23.70/0.7713/0.263 |
    | RealBasicVSR[28] | All frames | 23.74/0.7676/0.174 | 23.15/0.7603/0.202 |
    | EAVSR[30] | All frames | 24.20/0.7862/0.208 | 23.61/0.7618/0.264 |
    | EAVSR+[30] | All frames | 24.41/0.7953/0.212 | 23.94/0.7726/0.259 |
    | EAVSRGAN+[30] | All frames | 23.99/0.7726/0.170 | 23.35/0.7611/0.199 |

    表  5  不同帧间对齐方式的性能和参数比较

    Table  5  Performance and parameter comparisons of different inter-frame alignment

    | Alignment | Params (M) | Interpolation | Flow: GT | Flow: SpyNet |
    | --- | --- | --- | --- | --- |
    | Explicit (optical flow) | 1.35 | Nearest-neighbor | 31.84 | 31.78 |
    | Explicit (optical flow) | 1.35 | Bilinear | 31.92 | 31.85 |
    | Explicit (optical flow) | 1.35 | Bicubic | 31.93 | 31.89 |
    | Hybrid (flow-guided deformable convolution) | 1.60 | Bilinear | 32.08 | 31.98 |
    | Hybrid (flow-guided deformable attention) | 1.56 | Bilinear | 32.03 | 31.94 |
    | Hybrid (flow-guided patch alignment) | 1.35 | Nearest-neighbor | 31.81 | 31.82 |
    | Hybrid (flow-guided implicit alignment) | 1.36 | Attention-based implicit interpolation | 32.14 | 32.05 |
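Table 5 compares interpolation choices inside flow-based alignment. The underlying operation is a backward warp of the support frame toward the reference frame using a per-pixel flow field; a minimal pure-Python sketch of the bilinear variant (list-of-lists grayscale images and `(dx, dy)` flow tuples, both illustrative):

```python
def bilinear_warp(img, flow):
    # Backward-warp `img` (H x W list of lists) by per-pixel flow (dx, dy):
    # output[y][x] samples img at (x + dx, y + dy) with bilinear weights,
    # the "explicit alignment" used by flow-based VSR methods.
    H, W = len(img), len(img[0])
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            dx, dy = flow[y][x]
            sx = min(max(x + dx, 0.0), W - 1.0)  # clamp source coords to the image
            sy = min(max(y + dy, 0.0), H - 1.0)
            x0, y0 = int(sx), int(sy)
            x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
            wx, wy = sx - x0, sy - y0
            out[y][x] = ((1 - wx) * (1 - wy) * img[y0][x0]
                         + wx * (1 - wy) * img[y0][x1]
                         + (1 - wx) * wy * img[y1][x0]
                         + wx * wy * img[y1][x1])
    return out
```

Swapping the four-tap bilinear weighting for nearest-neighbor or bicubic sampling gives the other rows of Table 5; the deformable and implicit variants learn the sampling offsets or weights instead of fixing them.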

    表  6  GeForce RTX 3090平台下VSR的性能和推理时间对比结果

    Table  6  Performance and inference time comparisons of VSR algorithm on GeForce RTX 3090 platform

    | Method | Params (M) | Time (ms) | Alignment | Bicubic: REDS (RGB) | Bicubic: Vimeo-90K-T (Y) | Bicubic: Vid4 (Y) | Blur: Vimeo-90K-T (Y) | Blur: Vid4 (Y) | Blur: UDM10 (Y) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Bicubic | − | <1 | − | 26.23/0.7319 | 31.32/0.8684 | 23.78/0.6374 | 31.30/0.8687 | 21.80/0.5346 | 28.47/0.8253 |
    | TOFlow[12] | 1.41 | 250 | Explicit | 27.96/0.7981 | 33.08/0.9054 | 25.89/0.7651 | 34.62/0.9212 | 25.85/0.7659 | 36.26/0.9438 |
    | DUF[56] | 5.8 | 737.5 | None | 28.63/0.8251 | −/− | 27.33/0.8319 | 36.87/0.9447 | 27.38/0.8329 | 38.48/0.9605 |
    | EDVR[67] | 20.6 | 188.2 | Implicit | 31.09/0.8800 | 37.61/0.9489 | 27.35/0.8264 | 37.81/0.9523 | 27.85/0.8503 | 39.89/0.9686 |
    | TMP[61] | 3.1 | 31.5 | Implicit | 30.67/0.8710 | −/− | 27.10/0.8167 | 37.33/0.9481 | 27.61/0.8428 | −/− |
    | BasicVSR[75] | 6.3 | 45.4 | Explicit | 31.42/0.8909 | 37.18/0.9450 | 27.24/0.8251 | 37.53/0.9498 | 27.96/0.8553 | 39.96/0.9694 |
    | ICONVSR[75] | 8.7 | 58.4 | Explicit | 31.67/0.8948 | 37.47/0.9476 | 27.39/0.8279 | 37.84/0.9524 | 28.04/0.8570 | 40.03/0.9694 |
    | TTVSR[87] | 6.8 | 123.3 | Hybrid | 32.12/0.9021 | −/− | −/− | 37.92/0.9526 | 28.40/0.8643 | 40.41/0.9712 |
    | VRT[88] | 35.6 | 1679 | Hybrid | 32.17/0.9002 | 38.20/0.9530 | 27.93/0.8425 | 38.72/0.9584 | 29.37/0.8792 | 41.04/0.9737 |
    | BasicVSR++[83] | 7.3 | 60.2 | Hybrid | 32.39/0.9069 | 37.79/0.9500 | 27.79/0.8400 | 38.21/0.9550 | 29.04/0.8753 | 40.72/0.9722 |
    | PSRT[72] | 13.4 | 1280.2 | Hybrid | 32.72/0.9106 | 38.27/0.9536 | 28.07/0.8485 | −/− | −/− | −/− |
    | MIA-VSR[97] | 16.5 | 1194.6 | None | 32.78/0.9220 | 38.22/0.9532 | 28.20/0.8507 | −/− | −/− | −/− |
  • [1] Wan Z, Zhang B, Chen D, et al. Bringing old films back to life. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 17694−17703
    [2] Li G, Ji J, Qin M, et al. Towards high-quality and efficient video super-resolution via spatial-temporal data overfitting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE, 2023. 10259−10269
    [3] Zhu H, Wei Y, Liang X, et al. CTP: Towards vision-language continual pretraining via compatible momentum contrast and topology preservation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Paris, France: IEEE, 2023. 22257−22267
    [4] Jiao S, Wei Y, Wang Y, et al. Learning mask-aware clip representations for zero-shot segmentation. In: Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS). New Orleans, USA: 2023. 35631−35653.
    [5] Liu C, Sun D. On Bayesian adaptive video super resolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(2): 346−360 doi: 10.1109/TPAMI.2013.127
    [6] Ma Z, Liao R, Tao X, et al. Handling motion blur in multi-frame super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston, USA: IEEE, 2015. 5224−5232
    [7] Wu Y, Li F, Bai H, et al. Bridging component learning with degradation modelling for blind image super-resolution. IEEE Transactions on Multimedia, DOI: 10.1109/TMM.2022.3216115
    [8] 张帅勇, 刘美琴, 姚超, 林春雨, 赵耀. 分级特征反馈融合的深度图像超分辨率重建. 自动化学报, 2022, 48(4): 992−1003

    Zhang Shuai-Yong, Liu Mei-Qin, Yao Chao, Lin Chun-Yu, Zhao Yao. Hierarchical feature feedback network for depth super-resolution reconstruction. Acta Automatica Sinica, 2022, 48(4): 992−1003
    [9] Charbonnier P, Blanc-Feraud L, Aubert G, et al. Two deterministic half-quadratic regularization algorithms for computed imaging. In: Proceedings of 1st International Conference on Image Processing (ICIP). Austin, USA: IEEE, 1994. 168−172
    [10] Lai W S, Huang J B, Ahuja N, et al. Fast and accurate image super-resolution with deep laplacian pyramid networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 41(11): 2599−2613
    [11] Zha L, Yang Y, Lai Z, et al. A lightweight dense connected approach with attention on single image super-resolution. Electronics, 2021, 10(11): 1234 doi: 10.3390/electronics10111234
    [12] Xue T, Chen B, Wu J, et al. Video enhancement with task-oriented flow. International Journal of Computer Vision, 2019, 127(8): 1106−1125 doi: 10.1007/s11263-018-01144-2
    [13] Liu C, Sun D. A bayesian approach to adaptive video super resolution. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Colorado Springs, USA: IEEE, 2011. 209−216
    [14] Nah S, Baik S, Hong S, et al. Ntire 2019 challenge on video deblurring and super-resolution: Dataset and study. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Long Beach, USA: IEEE, 2019. 1996−2005
    [15] Protter M, Elad M, Takeda H, et al. Generalizing the nonlocal-means to super-resolution reconstruction. IEEE Transactions on Image Processing, 2008, 18(1): 36−51
    [16] Shahar O, Faktor A, Irani M. Space-time super-resolution from a single video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Colorado Springs, USA: IEEE, 2011. 3353−3360
    [17] Li D, Wang Z. Video superresolution via motion compensation and deep residual learning. IEEE Transactions on Computational Imaging, 2017, 3(4): 749−762 doi: 10.1109/TCI.2017.2671360
    [18] Venice[Online], available: https://www.harmonicinc.com/free-4k-demo-footage/, May 1, 2017
    [19] Myanmar 60p, Harmonic Inc. [Online], available: http://www.harmonicinc.com/resources/videos/4k-video-clip-center, May 1, 2017
    [20] ITS. Consumer digital video library [Online], available: https://www.cdvl.org, March 20, 2024
    [21] Mercat A, Viitanen M, Vanne J. UVG dataset: 50/120fps 4K sequences for video codec analysis and development. In: Proceedings of the ACM Multimedia Systems Conference. Istanbul, Turkey: ACM, 2020. 297−302
    [22] Liu D, Wang Z, Fan Y, et al. Robust video super-resolution with learned temporal dynamics. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 2507−2515
    [23] Tao X, Gao H, Liao R, et al. Detail-revealing deep video super-resolution. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 4472−4480
    [24] Wang Z, Yi P, Jiang K, et al. Multi-memory convolutional neural network for video super-resolution. IEEE Transactions on Image Processing, 2018, 28(5): 2530−2544
    [25] Yi P, Wang Z, Jiang K, et al. Progressive fusion video super-resolution network via exploiting non-local spatio-temporal correlations. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, Korea: IEEE, 2019. 3106−3115
    [26] Yu J, Liu J, Bo L, et al. Memory-augmented non-local attention for video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 17834−17843
    [27] Yang X, Xiang W, Zeng H, et al. Real-world video super-resolution: A benchmark dataset and a decomposition based learning scheme. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, Canada: IEEE, 2021. 4781−4790
    [28] Chan K C K, Zhou S, Xu X, et al. Investigating tradeoffs in real-world video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 5962−5971
    [29] Lee J, Lee M, Cho S, et al. Reference-based video super-resolution using multi-camera video triplets. In: Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 17824−17833
    [30] Wang R, Liu X, Zhang Z, et al. Benchmark dataset and effective inter-frame alignment for real-world video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE, 2023. 1168−1177
    [31] Huang Y, Dong H, Pan J, et al. Boosting video super resolution with patch-based temporal redundancy optimization. In: Proceedings of International Conference on Artificial Neural Networks (ICANN). Heraklion, Greece: Springer, 2023. 362−375
    [32] Zhou S, Yang P, Wang J, et al. Upscale-A-Video: Temporal-consistent diffusion model for real-world video super-resolution. arXiv preprint arXiv: 2312.06640, 2023.
    [33] Wang X, Xie L, Dong C, et al. Real-esrgan: Training real-world blind super-resolution with pure synthetic data. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Montreal, Canada: IEEE, 2021. 1905−1914
    [34] Singh A, Singh J. Survey on single image based super-resolution—implementation challenges and solutions. Multimedia Tools and Applications, 2020, 79(3−5): 1641−1672
    [35] You Z, Li Z, Gu J, et al. Depicting beyond scores: Advancing image quality assessment through multi-modal language models. arXiv preprint arXiv: 2312.08962, 2023.
    [36] You Z, Gu J, Li Z, et al. Descriptive image quality assessment in the wild. arXiv preprint arXiv: 2405.18842, 2024.
    [37] Xie L, Wang X, Zhang H, et al. VFHQ: A high-quality dataset and benchmark for video face super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 657−666
    [38] Zhou F, Sheng W, Lu Z, et al. A database and model for the visual quality assessment of super-resolution videos. IEEE Transactions on Broadcasting, 2024, 70(2): 516−532 doi: 10.1109/TBC.2024.3382949
    [39] Jin J, Zhang X, Fu X, et al. Just noticeable difference for deep machine vision. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 32(6): 3452−3461
    [40] Kappeler A, Yoo S, Dai Q, et al. Video super-resolution with convolutional neural networks. IEEE Transactions on Computational Imaging, 2016, 2(2): 109−122 doi: 10.1109/TCI.2016.2532323
    [41] Lucas A, Lopez-Tapia S, Molina R, et al. Generative adversarial networks and perceptual losses for video super-resolution. IEEE Transactions on Image Processing, 2019, 28(7): 3312−3327 doi: 10.1109/TIP.2019.2895768
    [42] Caballero J, Ledig C, Aitken A, et al. Real-time video super-resolution with spatio-temporal networks and motion compensation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 4778−4787
    [43] Kim S Y, Lim J, Na T, et al. 3DSRNet: video super-resolution using 3D convolutional neural networks. arXiv preprint arXiv: 1812.09079, 2018.
    [44] Li D, Liu Y, Wang Z. Video super-resolution using non-simultaneous fully recurrent convolutional network. IEEE Transactions on Image Processing, 2018, 28(3): 1342−1355
    [45] Haris M, Shakhnarovich G, Ukita N. Space-time-aware multi-resolution video enhancement. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2020. 2859−2868
    [46] Bao W, Lai W S, Zhang X, et al. MEMC-Net: Motion estimation and motion compensation driven neural network for video interpolation and enhancement. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 43(3): 933−948
    [47] Zhu X, Li Z, Lou J, et al. Video super-resolution based on a spatio-temporal matching network. Pattern Recognition, 2021, 110: 107619 doi: 10.1016/j.patcog.2020.107619
    [48] Wang L, Guo Y, Liu L, et al. Deep video super-resolution using HR optical flow estimation. IEEE Transactions on Image Processing, 2020, 29: 4323−4336 doi: 10.1109/TIP.2020.2967596
    [49] Zhu X, Li Z, Zhang X Y, et al. Residual invertible spatio-temporal network for video super-resolution. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Honolulu, USA: AAAI, 2019. 5981−5988
    [50] Bare B, Yan B, Ma C, et al. Real-time video super-resolution via motion convolution kernel estimation. Neurocomputing, 2019, 367: 236−245 doi: 10.1016/j.neucom.2019.07.089
    [51] Tian Y, Zhang Y, Fu Y, et al. TDAN: Temporally-deformable alignment network for video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2020. 3360−3369
    [52] Ying X, Wang L, Wang Y, et al. Deformable 3D convolution for video super-resolution. IEEE Signal Processing Letters, 2020, 27: 1500−1504 doi: 10.1109/LSP.2020.3013518
    [53] Yan B, Lin C, Tan W. Frame and feature-context video super-resolution. In: Proceedings of the 33th AAAI Conference on Artificial Intelligence (AAAI). Honolulu, USA: AAAI, 2019: 5597−5604
    [54] Liu S, Zheng C, Lu K, et al. Evsrnet: Efficient video super-resolution with neural architecture search. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021. 2480−2485
    [55] Rota C, Buzzelli M, van de Weijer J. Enhancing perceptual quality in video super-resolution through temporally-consistent detail synthesis using diffusion models. arXiv preprint arXiv: 2311.15908, 2023.
    [56] Jo Y, Oh S W, Kang J, et al. Deep video super-resolution network using dynamic upsampling filters without explicit motion compensation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, USA: IEEE, 2018. 3224−3232
    [57] Yi P, Wang Z, Jiang K, et al. Progressive fusion video super-resolution network via exploiting non-local spatio-temporal correlations. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, Korea: IEEE, 2019. 3106−3115
    [58] Sun W, Sun J, Zhu Y, et al. Video super-resolution via dense non-local spatial-temporal convolutional network. Neurocomputing, 2020, 403: 1−12 doi: 10.1016/j.neucom.2020.04.039
    [59] Haris M, Shakhnarovich G, Ukita N. Recurrent back-projection network for video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 3897−3906
    [60] Liu H, Zhao P, Ruan Z, et al. Large motion video super-resolution with dual subnet and multi-stage communicated upsampling. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Virtual Event: AAAI, 2021. 2127−2135
    [61] Zhang Z, Li R, Guo S, et al. TMP: Temporal motion propagation for online video super-resolution. arXiv preprint arXiv: 2312.09909, 2023.
    [62] Li W, Tao X, Guo T, et al. Mucan: Multi-correspondence aggregation network for video super-resolution. In: Proceedings of the European Conference on Computer Vision (ECCV). Glasgow, UK: Springer, 2020. 335−351
    [63] Song H, Xu W, Liu D, et al. Multi-stage feature fusion network for video super-resolution. IEEE Transactions on Image Processing, 2021, 30: 2923−2934 doi: 10.1109/TIP.2021.3056868
    [64] Fuoli D, Danelljan M, Timofte R, et al. Fast online video super-resolution with deformable attention pyramid. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Waikoloa, USA: IEEE, 2023. 1735−1744
    [65] Kalarot R, Porikli F. Multiboot VSR: Multi-stage multi-reference bootstrapping for video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Long Beach, USA: IEEE, 2019. 2060−2069
    [66] Xia B, He J, Zhang Y, et al. Structured sparsity learning for efficient video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE, 2023. 22638−22647
    [67] Wang X, Chan K C K, Yu K, et al. EDVR: Video restoration with enhanced deformable convolutional networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Long Beach, USA: IEEE, 2019. 1954−1963
    [68] Fuoli D, Gu S, Timofte R. Efficient video super-resolution through recurrent latent space propagation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). Seoul, Korea: IEEE, 2019. 3476−3485
    [69] Isobe T, Li S, Jia X, et al. Video super-resolution with temporal group attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2020. 8008−8017
    [70] Jin S, Liu M, Yao C, et al. Kernel Dimension Matters: To activate available kernels for real-time video super-resolution. In: Proceedings of the ACM International Conference on Multimedia (ACM MM). Ottawa, Canada: ACM, 2023. 8617−8625
    [71] Cao J, Li Y, Zhang K, et al. Video super-resolution transformer. arXiv preprint arXiv: 2106.06847, 2021.
    [72] Shi S, Gu J, Xie L, et al. Rethinking alignment in video super-resolution transformers. In: Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS). New Orleans, USA: 2022. 36081−36093
    [73] Tang Q, Zhao Y, Liu M, et al. SeeClear: Semantic distillation enhances pixel condensation for video super-resolution. arXiv preprint arXiv: 2410.05799, 2024.
    [74] Huang C, Li J, Chu L, et al. Disentangle propagation and restoration for efficient video recovery. In: Proceedings of the ACM International Conference on Multimedia (ACM MM). Ottawa, Canada: ACM, 2023. 8336−8345
    [75] Chan K C K, Wang X, Yu K, et al. BasicVSR: The search for essential components in video super-resolution and beyond. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021. 4947−4956
    [76] Chen Z, Long F, Qiu Z, et al. Learning spatial adaptation and temporal coherence in diffusion models for video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2024. 9232−9241
    [77] Leng J, Wang J, Gao X, et al. ICNet: Joint alignment and reconstruction via iterative collaboration for video super-resolution. In: Proceedings of the ACM International Conference on Multimedia (ACM MM). Lisboa, Portugal: ACM, 2022. 6675−6684
    [78] Yi P, Wang Z, Jiang K, et al. A progressive fusion generative adversarial network for realistic and consistent video super-resolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 44(5): 2264−2280
    [79] Zhang F, Chen G, Wang H, et al. Multi-scale video super-resolution transformer with polynomial approximation. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(9): 4496−4506 doi: 10.1109/TCSVT.2023.3278131
    [80] Qiu Z, Yang H, Fu J, et al. Learning spatiotemporal frequency-transformer for compressed video super-resolution. In: Proceedings of the European Conference on Computer Vision (ECCV). Tel Aviv, Israel: Springer, 2022. 257−273
    [81] Jiang Y, Chan K C K, Wang X, et al. Reference-based image and video super-resolution via C2-matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 45(7): 8874−8887
    [82] Isobe T, Jia X, Tao X, et al. Look back and forth: Video super-resolution with explicit temporal difference modeling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 17411−17420
    [83] Chan K C K, Wang X, Yu K, et al. BasicVSR: The search for essential components in video super-resolution and beyond. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021. 4947−4956
    [84] Zhou K, Li W, Lu L, et al. Revisiting temporal alignment for video restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 6053−6062
    [85] Tang Q, Zhao Y, Liu M, et al. Semantic lens: Instance-centric semantic alignment for video super-resolution. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Vancouver, Canada: AAAI, 2024. 5154−5161
    [86] Liu M, Jin S, Yao C, et al. Temporal consistency learning of inter-frames for video super-resolution. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 33(4): 1507−1520
    [87] Liu C, Yang H, Fu J, et al. Learning trajectory-aware transformer for video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 5687−5696
    [88] Liang J, Cao J, Fan Y, et al. VRT: A video restoration transformer. IEEE Transactions on Image Processing, 2024, 33: 2171−2182 doi: 10.1109/TIP.2024.3372454
    [89] Tang J, Lu C, Liu Z, et al. CTVSR: Collaborative spatial-temporal transformer for video super-resolution. IEEE Transactions on Circuits and Systems for Video Technology, DOI: 10.1109/TCSVT.2023.3340439
    [90] Qiu Z, Yang H, Fu J, et al. Learning degradation-robust spatiotemporal frequency-transformer for video super-resolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(12): 14888−14904 doi: 10.1109/TPAMI.2023.3312166
    [91] Zhang C, Wang X, Xiong R, et al. Local-global dynamic filtering network for video super-resolution. IEEE Transactions on Computational Imaging, 2023, 9: 963−976 doi: 10.1109/TCI.2023.3321980
    [92] Jiang L, Wang N, Dang Q, et al. PP-MSVSR: Multi-stage video super-resolution. arXiv preprint arXiv: 2112.02828, 2021.
    [93] Liang J, Fan Y, Xiang X, et al. Recurrent video restoration transformer with guided deformable attention. In: Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS). New Orleans, USA: 2022. 378−393
    [94] Dong S, Lu F, Wu Z, et al. DFVSR: Directional frequency video super-resolution via asymmetric and enhancement alignment network. In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). Macao, China: IJCAI, 2023. 681−689
    [95] Li F, Zhang L, Liu Z, et al. Multi-frequency representation enhancement with privilege information for video super-resolution. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Paris, France: IEEE, 2023. 12814−12825
    [96] Kai D, Lu J, Zhang Y, et al. EvTexture: Event-driven texture enhancement for video super-resolution. arXiv preprint arXiv: 2406.13457, 2024.
    [97] Zhou X, Zhang L, Zhao X, et al. Video super-resolution transformer with masked inter&intra-frame attention. arXiv preprint arXiv: 2401.06312, 2024.
    [98] Xu K, Yu Z, Wang X, et al. An implicit alignment for video super-resolution. arXiv preprint arXiv: 2305.00163, 2023.
    [99] Huang Y, Wang W, Wang L. Video super-resolution via bidirectional recurrent convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(4): 1015−1028
    [100] Chu M, Xie Y, Mayer J, et al. Learning temporal coherence via self-supervision for GAN-based video generation. ACM Transactions on Graphics, 2020, 39(4): 75
    [101] Isobe T, Zhu F, Jia X, et al. Revisiting temporal modeling for video super-resolution. arXiv preprint arXiv: 2008.05765, 2020.
    [102] Sajjadi M S M, Vemulapalli R, Brown M. Frame-recurrent video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, USA: IEEE, 2018. 6626−6634
    [103] Isobe T, Jia X, Gu S, et al. Video super-resolution with recurrent structure-detail network. In: Proceedings of the European Conference on Computer Vision (ECCV). Glasgow, UK: Springer, 2020. 645−660
    [104] Baniya A A, Lee T K, Eklund P W, et al. Online video super-resolution using information replenishing unidirectional recurrent model. Neurocomputing, 2023, 546: 126355 doi: 10.1016/j.neucom.2023.126355
    [105] Lin J, Huang Y, Wang L. FDAN: Flow-guided deformable alignment network for video super-resolution. arXiv preprint arXiv: 2105.05640, 2021.
    [106] Yi P, Wang Z, Jiang K, et al. Omniscient video super-resolution. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, Canada: IEEE, 2021. 4429−4438
    [107] Li S, He F, Du B, et al. Fast spatio-temporal residual network for video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 10522−10531
    [108] Yu J, Liu J, Bo L, et al. Memory-augmented non-local attention for video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 17834−17843
    [109] Tao X, Gao H, Liao R, et al. Detail-revealing deep video super-resolution. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 4472−4480
    [110] Yang X, He C, Ma J, et al. Motion-guided latent diffusion for temporally consistent real-world video super-resolution. arXiv preprint arXiv: 2312.00853, 2023.
    [111] Liu H, Ruan Z, Zhao P, et al. Video super-resolution based on deep learning: a comprehensive survey. Artificial Intelligence Review, 2022, 55(8): 5981−6035 doi: 10.1007/s10462-022-10147-y
    [112] Tu Z, Li H, Xie W, et al. Optical flow for video super-resolution: A survey. Artificial Intelligence Review, 2022, 55(8): 6505−6546 doi: 10.1007/s10462-022-10159-8
    [113] Baniya A A, Lee G, Eklund P, et al. A methodical study of deep learning based video super-resolution. Authorea Preprints, DOI: 10.36227/techrxiv.23896986.v1
    [114] 江俊君, 程豪, 李震宇, 刘贤明, 王中元. 深度学习视频超分辨率技术概述. 中国图象图形学报, 2023, 28(7): 1927−1964 doi: 10.11834/jig.220130
    Jiang Jun-Jun, Cheng Hao, Li Zhen-Yu, Liu Xian-Ming, Wang Zhong-Yuan. Deep learning based video-related super-resolution technique: A survey. Journal of Image and Graphics, 2023, 28(7): 1927−1964 doi: 10.11834/jig.220130
    [115] Dong C, Loy C C, He K, et al. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 38(2): 295−307
    [116] Drulea M, Nedevschi S. Total variation regularization of local-global optical flow. In: Proceedings of the International IEEE Conference on Intelligent Transportation Systems (ITSC). Washington, USA: IEEE, 2011. 318−323
    [117] Haris M, Shakhnarovich G, Ukita N. Deep back-projection networks for super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, USA: IEEE, 2018. 1664−1673
    [118] Dai J, Qi H, Xiong Y, et al. Deformable convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 764−773
    [119] Zhu X, Hu H, Lin S, et al. Deformable convnets v2: More deformable, better results. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 9308−9316
    [120] Chan K C K, Wang X, Yu K, et al. Understanding deformable alignment in video super-resolution. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Virtual Event: AAAI, 2021. 973−981
    [121] Butler D J, Wulff J, Stanley G B, et al. A naturalistic open source movie for optical flow evaluation. In: Proceedings of European Conference on Computer Vision (ECCV). Florence, Italy: Springer, 2012. 611−625
    [122] Lian W, Lian W. Sliding window recurrent network for efficient video super-resolution. In: Proceedings of the European Conference on Computer Vision Workshops (ECCVW). Tel Aviv, Israel: Springer, 2022. 591−601
    [123] Xiao J, Jiang X, Zheng N, et al. Online video super-resolution with convolutional kernel bypass grafts. IEEE Transactions on Multimedia, 2023, 25: 8972−8987 doi: 10.1109/TMM.2023.3243615
    [124] Li D, Shi X, Zhang Y, et al. A simple baseline for video restoration with grouped spatial-temporal shift. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE, 2023. 9822−9832
    [125] Geng Z, Liang L, Ding T, et al. RSTT: Real-time spatial temporal transformer for space-time video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 17441−17451
    [126] Lin L, Wang X, Qi Z, et al. Accelerating the training of video super-resolution models. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Washington, USA: AAAI, 2023. 1595−1603
    [127] Li H, Chen X, Dong J, et al. Collaborative feedback discriminative propagation for video super-resolution. arXiv preprint arXiv: 2404.04745, 2024.
    [128] Hu M, Jiang K, Wang Z, et al. CycMuNet+: Cycle-projected mutual learning for spatial-temporal video super-resolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(11): 13376−13392
    [129] Xiao Y, Yuan Q, Jiang K, et al. Local-global temporal difference learning for satellite video super-resolution. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34(4): 2789−2802 doi: 10.1109/TCSVT.2023.3312321
    [130] Hui Y, Liu Y, Liu Y, et al. VJT: A video transformer on joint tasks of deblurring, low-light enhancement and denoising. arXiv preprint arXiv: 2401.14754, 2024.
    [131] Song Y, Wang M, Yang Z, et al. NegVSR: Augmenting negatives for generalized noise modeling in real-world video super-resolution. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Vancouver, Canada: AAAI, 2024. 10705−10713
    [132] Wang Y, Isobe T, Jia X, et al. Compression-aware video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE, 2023. 2012−2021
    [133] Youk G, Oh J, Kim M. FMA-Net: Flow-guided dynamic filtering and iterative feature refinement with multi-attention for joint video super-resolution and deblurring. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2024. 44−55
    [134] Zhang Y, Yao A. RealViformer: Investigating attention for real-world video super-resolution. arXiv preprint arXiv: 2407.13987, 2024.
    [135] Xiang X, Tian Y, Zhang Y, et al. Zooming slow-mo: Fast and accurate one-stage space-time video super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2020. 3370−3379
    [136] Jeelani M, Cheema N, Illgner-Fehns K, et al. Expanding synthetic real-world degradations for blind video super resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Vancouver, Canada: IEEE, 2023. 1199−1208
    [137] Bai H, Pan J. Self-supervised deep blind video super-resolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46(7): 4641−4653 doi: 10.1109/TPAMI.2024.3361168
    [138] Pan J, Bai H, Dong J, et al. Deep blind video super-resolution. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, Canada: IEEE, 2021. 4811−4820
    [139] Chen H, Li W, Gu J, et al. Low-res leads the way: Improving generalization for super-resolution by self-supervised learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2024. 25857−25867
    [140] Yuan J, Ma J, Wang B, et al. Content-decoupled contrastive learning-based implicit degradation modeling for blind image super-resolution. arXiv preprint arXiv: 2408.05440, 2024.
    [141] Chen Y H, Chen S C, Lin Y Y, et al. MoTIF: Learning motion trajectories with local implicit neural functions for continuous space-time video super-resolution. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Paris, France: IEEE, 2023. 23131−23141
    [142] Huang C, Li J, Chu L, et al. Arbitrary-scale video super-resolution guided by dynamic context. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Vancouver, Canada: AAAI, 2024. 2294−2302
    [143] Li Z, Liu H, Shang F, et al. SAVSR: Arbitrary-scale video super-resolution via a learned scale-adaptive network. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). Vancouver, Canada: AAAI, 2024. 3288−3296
    [144] Huang Z, Huang A, Hu X, et al. Scale-adaptive feature aggregation for efficient space-time video super-resolution. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Waikoloa, USA: IEEE, 2024. 4228−4239
    [145] Xu Y, Park T, Zhang R, et al. VideoGigaGAN: Towards detail-rich video super-resolution. arXiv preprint arXiv: 2404.12388, 2024.
    [146] He Q, Wang S, Liu T, et al. Enhancing measurement precision for rotor vibration displacement via a progressive video super resolution network. IEEE Transactions on Instrumentation and Measurement, 2024, 73: 1−13
    [147] Chang J, Zhao Z, Jia C, et al. Conceptual compression via deep structure and texture synthesis. IEEE Transactions on Image Processing, 2022, 31: 2809−2823 doi: 10.1109/TIP.2022.3159477
    [148] Chang J, Zhang J, Li J, et al. Semantic-aware visual decomposition for image coding. International Journal of Computer Vision, 2023, 131(9): 2333−2355 doi: 10.1007/s11263-023-01809-7
    [149] Ren B, Li Y, Liang J, et al. Sharing key semantics in transformer makes efficient image restoration. arXiv preprint arXiv: 2405.20008, 2024.
    [150] Wu R, Sun L, Ma Z, et al. One-step effective diffusion network for real-world image super-resolution. arXiv preprint arXiv: 2406.08177, 2024.
    [151] Sun H, Li W, Liu J, et al. CoSeR: Bridging image and language for cognitive super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2024. 25868−25878
    [152] Wu R, Yang T, Sun L, et al. SeeSR: Towards semantics-aware real-world image super-resolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2024. 25456−25467
    [153] Zhang Y, Zhang H, Chai X, et al. MRIR: Integrating multimodal insights for diffusion-based realistic image restoration. arXiv preprint arXiv: 2407.03635, 2024.
    [154] Zhang Y, Zhang H, Chai X, et al. Diff-restorer: Unleashing visual prompts for diffusion-based universal image restoration. arXiv preprint arXiv: 2407.03636, 2024.
    [155] Ouyang H, Wang Q, Xiao Y, et al. CoDeF: Content deformation fields for temporally consistent video processing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2024. 8089−8099
    [156] Hu J, Gu J, Yu S, et al. Interpreting low-level vision models with causal effect maps. arXiv preprint arXiv: 2407.19789, 2024.
    [157] Gu J, Dong C. Interpreting super-resolution networks with local attribution maps. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Virtual: IEEE, 2021. 9199−9208
    [158] Cao J, Liang J, Zhang K, et al. Towards interpretable video super-resolution via alternating optimization. In: Proceedings of the European Conference on Computer Vision (ECCV). Tel Aviv, Israel: Springer, 2022. 393−411
Publication history
  • Received: 2024-04-29
  • Accepted: 2024-10-16
  • Published online: 2025-03-06