Computing Model Parameter Counts and FLOPs

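  • Model parameter count
    Convolutional layer
    $$\text{parameters} = (\text{kernel height} \times \text{kernel width} \times \text{kernel depth}) \times \text{number of kernels} + \text{bias parameters}$$
    The kernel depth is the number of input channels, which is determined by the output channels of the previous layer. The number of bias parameters equals the number of kernels.
    Fully connected layer
    Every input is connected to every output, and each output has one bias, so the parameter count is $N_{in} \times N_{out} + N_{out}$.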

  • Model FLOPs
    Convolutional layer
    $$\text{FLOPs} = \text{parameters} \times \text{output feature map size}$$
    where the output feature map size of the layer is $h \times w$.

    Fully connected layer
    Because there is no weight sharing, the FLOPs count of the layer equals its parameter count:
    $$N_{in} \times N_{out} + N_{out}$$
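The formulas above can be written as a few small helpers. This is a minimal sketch, not code from any of the referenced posts: the function and argument names (`conv_params`, `fc_params`, `conv_flops`, `fc_flops`, `k_h`, `k_w`, `c_in`, `c_out`) are made up for illustration, and "FLOPs" follows the convention used here of one operation per weight per output position. Some tools count a multiply-add as two operations instead.

```python
def conv_params(k_h, k_w, c_in, c_out, bias=True):
    """Parameters of a standard convolution: one k_h x k_w x c_in kernel per output channel."""
    weights = k_h * k_w * c_in * c_out
    return weights + (c_out if bias else 0)


def fc_params(n_in, n_out, bias=True):
    """Parameters of a fully connected layer: a dense weight matrix plus one bias per output."""
    return n_in * n_out + (n_out if bias else 0)


def conv_flops(k_h, k_w, c_in, c_out, out_h, out_w, bias=True):
    """FLOPs as defined above: parameter count times the output feature map size h * w."""
    return conv_params(k_h, k_w, c_in, c_out, bias) * out_h * out_w


def fc_flops(n_in, n_out, bias=True):
    """No weight sharing, so each parameter is used once: FLOPs equal the parameter count."""
    return fc_params(n_in, n_out, bias)


if __name__ == "__main__":
    # A 3x3 convolution from 3 to 64 channels producing a 224x224 output map.
    print(conv_params(3, 3, 3, 64))           # 1792 (1728 weights + 64 biases)
    print(conv_flops(3, 3, 3, 64, 224, 224))  # 89915392
    print(fc_params(4096, 4096))              # 16781312
```

Passing `bias=False` gives the bias-free counts, which is convenient when comparing against tables that leave the bias terms out.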

Counting parameters and FLOPs in deep learning (using the classic CNN architecture AlexNet as an example) - Never-Giveup's blog - CSDN Blog
https://blog.csdn.net/qq_36653505/article/details/86700885


Classic CNN models in plain words: AlexNet - 雪饼's personal space - OSCHINA
https://my.oschina.net/u/876354/blog/1633143

  • Counting the multiplications in a matrix product
    Counting the multiplications in a matrix product - Kellbook's blog - CSDN Blog https://blog.csdn.net/qq_30622831/article/details/82730986

The FLOPs computation in the script linked below includes this line:

        kernel_ops = self.kernel_size[0] * self.kernel_size[1] * (self.in_channels / self.groups) 

rethinking-network-pruning/compute_flops.py at master · Eric-mingjie/rethinking-network-pruning https://github.com/Eric-mingjie/rethinking-network-pruning/blob/master/cifar/l1-norm-pruning/compute_flops.py

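Here, self.groups is the groups argument of the convolution layer. It splits the input channels into groups, so each kernel only covers in_channels / groups of them. For details, see:
The role of the groups parameter in PyTorch functions - 慢行厚积 - cnblogs (博客园) https://www.cnblogs.com/wanghui-garcia/p/10775851.html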
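To see how the groups value changes the count, here is a small pure-Python sketch. It is an illustration written for this note, not the code of the linked script: the helper names `grouped_conv_kernel_ops` and `grouped_conv_flops` are hypothetical, biases are left out, and integer division is used on the assumption that in_channels is divisible by groups, which the sketch checks explicitly.

```python
def grouped_conv_kernel_ops(kernel_size, in_channels, groups=1):
    """Multiplications needed for one output value of a grouped convolution.

    Each kernel only covers in_channels / groups input channels.
    """
    if in_channels % groups != 0:
        raise ValueError("in_channels must be divisible by groups")
    k_h, k_w = kernel_size
    return k_h * k_w * (in_channels // groups)


def grouped_conv_flops(kernel_size, in_channels, out_channels, out_h, out_w, groups=1):
    """Bias-free FLOPs: per-output kernel ops times the number of output values."""
    per_output = grouped_conv_kernel_ops(kernel_size, in_channels, groups)
    return per_output * out_channels * out_h * out_w


if __name__ == "__main__":
    # 3x3 kernels over 256 input channels with different group counts.
    print(grouped_conv_kernel_ops((3, 3), 256, groups=1))    # 2304 (standard convolution)
    print(grouped_conv_kernel_ops((3, 3), 256, groups=2))    # 1152
    print(grouped_conv_kernel_ops((3, 3), 256, groups=256))  # 9 (depthwise convolution)
```

With groups=1 this reduces to the standard convolution count, and with groups equal to in_channels each kernel covers a single channel.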

Example 1 (VGG16 on ImageNet)

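The calculations in the figure ignore the bias parameters (whose count equals the number of kernels).
[Figure: VGG16 layer table giving the Layer ID, Patch Size, and Output Size of each layer]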

  • Layer ID 1: convolutional layer (Conv1_1)
    Parameter count
    $$
    \begin{aligned}
    \text{Conv1\_1} &= [(\text{LayerID1, Patch Size (Height)}) \times (\text{LayerID1, Patch Size (Width)}) \times (\text{LayerID0, Output Size (Channel)})] \\
    &\quad \times (\text{LayerID1, Output Size (Channel)}) \\
    &= (3 \times 3 \times 3) \times 64 \\
    &= 1728
    \end{aligned}
    $$

    FLOPs
    $$
    \begin{aligned}
    \text{Conv1\_1} &= [(\text{LayerID1, Patch Size (Height)}) \times (\text{LayerID1, Patch Size (Width)}) \times (\text{LayerID0, Output Size (Channel)})] \\
    &\quad \times (\text{LayerID1, Output Size (Channel)}) \\
    &\quad \times (\text{LayerID1, Output Size (Height)}) \times (\text{LayerID1, Output Size (Width)}) \\
    &= (3 \times 3 \times 3) \times 64 \times 224 \times 224 \\
    &= 1728 \times 224 \times 224 \\
    &= 86{,}704{,}128
    \end{aligned}
    $$

  • Layer ID 12: convolutional layer (Conv4_1)
    Parameter count
    $$
    \begin{aligned}
    \text{Conv4\_1} &= [(\text{LayerID12, Patch Size (Height)}) \times (\text{LayerID12, Patch Size (Width)}) \times (\text{LayerID11, Output Size (Channel)})] \\
    &\quad \times (\text{LayerID12, Output Size (Channel)}) \\
    &= (3 \times 3 \times 256) \times 512 \\
    &= 1{,}179{,}648
    \end{aligned}
    $$

    FLOPs
    $$
    \begin{aligned}
    \text{Conv4\_1} &= [(\text{LayerID12, Patch Size (Height)}) \times (\text{LayerID12, Patch Size (Width)}) \times (\text{LayerID11, Output Size (Channel)})] \\
    &\quad \times (\text{LayerID12, Output Size (Channel)}) \\
    &\quad \times (\text{LayerID12, Output Size (Height)}) \times (\text{LayerID12, Output Size (Width)}) \\
    &= (3 \times 3 \times 256) \times 512 \times 28 \times 28 \\
    &= 1{,}179{,}648 \times 28 \times 28 \\
    &= 924{,}844{,}032
    \end{aligned}
    $$

  • Layer ID 22: fully connected layer (FC1)
    Parameter count
    $$
    \begin{aligned}
    \text{FC1} &= [(\text{LayerID21, Output Size (Height)}) \times (\text{LayerID21, Output Size (Width)}) \times (\text{LayerID21, Output Size (Channel)})] \\
    &\quad \times [(\text{LayerID22, Output Size (Height)}) \times (\text{LayerID22, Output Size (Width)}) \times (\text{LayerID22, Output Size (Channel)})] \\
    &= (7 \times 7 \times 512) \times (1 \times 1 \times 4096) \\
    &= 102{,}760{,}448
    \end{aligned}
    $$

    FLOPs
    $$
    \begin{aligned}
    \text{FC1} &= [(\text{LayerID21, Output Size (Height)}) \times (\text{LayerID21, Output Size (Width)}) \times (\text{LayerID21, Output Size (Channel)})] \\
    &\quad \times [(\text{LayerID22, Output Size (Height)}) \times (\text{LayerID22, Output Size (Width)}) \times (\text{LayerID22, Output Size (Channel)})] \\
    &= (7 \times 7 \times 512) \times (1 \times 1 \times 4096) \\
    &= 102{,}760{,}448
    \end{aligned}
    $$

  • Layer ID 23: fully connected layer (FC2)
    Parameter count
    $$
    \begin{aligned}
    \text{FC2} &= [(\text{LayerID22, Output Size (Height)}) \times (\text{LayerID22, Output Size (Width)}) \times (\text{LayerID22, Output Size (Channel)})] \\
    &\quad \times [(\text{LayerID23, Output Size (Height)}) \times (\text{LayerID23, Output Size (Width)}) \times (\text{LayerID23, Output Size (Channel)})] \\
    &= (1 \times 1 \times 4096) \times (1 \times 1 \times 4096) \\
    &= 16{,}777{,}216
    \end{aligned}
    $$

    FLOPs
    $$
    \begin{aligned}
    \text{FC2} &= [(\text{LayerID22, Output Size (Height)}) \times (\text{LayerID22, Output Size (Width)}) \times (\text{LayerID22, Output Size (Channel)})] \\
    &\quad \times [(\text{LayerID23, Output Size (Height)}) \times (\text{LayerID23, Output Size (Width)}) \times (\text{LayerID23, Output Size (Channel)})] \\
    &= (1 \times 1 \times 4096) \times (1 \times 1 \times 4096) \\
    &= 16{,}777{,}216
    \end{aligned}
    $$
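The four worked results above can be rechecked with a few lines of Python. This is only a sanity-check sketch: the layer shapes are copied from the calculations above, biases are ignored as in the example, and the dictionary layout and the `summarize` helper are hypothetical names chosen for illustration.

```python
# Shapes taken from the worked examples above; biases are ignored, as in the text.
conv_layers = {
    "LayerID1": dict(k_h=3, k_w=3, c_in=3, c_out=64, out_h=224, out_w=224),
    "LayerID12": dict(k_h=3, k_w=3, c_in=256, c_out=512, out_h=28, out_w=28),
}
fc_layers = {
    "LayerID22": dict(n_in=7 * 7 * 512, n_out=4096),
    "LayerID23": dict(n_in=1 * 1 * 4096, n_out=4096),
}


def summarize():
    """Return {layer name: (parameter count, FLOPs)} for the layers listed above."""
    results = {}
    for name, s in conv_layers.items():
        params = s["k_h"] * s["k_w"] * s["c_in"] * s["c_out"]
        # Convolution: every weight is reused at each output position.
        results[name] = (params, params * s["out_h"] * s["out_w"])
    for name, s in fc_layers.items():
        params = s["n_in"] * s["n_out"]
        # Fully connected: no weight sharing, so FLOPs equal the parameter count.
        results[name] = (params, params)
    return results


if __name__ == "__main__":
    for name, (params, flops) in summarize().items():
        print(f"{name}: params={params:,} flops={flops:,}")
```

The convolution rows show the effect of weight sharing: the same weights are applied at every output position, so FLOPs grow with the feature map size while the parameter count does not.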
