Model Parameter Count and FLOPs Calculation
-
Model parameter count
Convolutional layer
$$
\text{parameter count} = [\text{kernel length} \times \text{kernel width} \times \text{kernel depth}] \times \text{number of kernels} + \text{bias parameters}
$$
Here the kernel depth is the channel count, which is determined by the previous layer's output channels, and the number of bias parameters equals the number of kernels.
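As a quick check of the convolutional-layer formula, here is a minimal Python sketch. The function name conv_params and its argument names are illustrative choices, not something from the original post; it assumes a standard (non-grouped) convolution with one bias per kernel.

```python
def conv_params(kernel_h, kernel_w, in_channels, num_kernels, bias=True):
    """Parameter count of a standard convolutional layer.

    Hypothetical helper: each kernel holds kernel_h * kernel_w * in_channels
    weights; there are num_kernels kernels, plus one bias per kernel.
    """
    weights = kernel_h * kernel_w * in_channels * num_kernels
    biases = num_kernels if bias else 0
    return weights + biases


# A 3x3 convolution from 3 input channels to 64 output channels.
print(conv_params(3, 3, 3, 64))              # 1792 (1728 weights + 64 biases)
print(conv_params(3, 3, 3, 64, bias=False))  # 1728
```

Passing bias=False gives the weight-only count, which is convenient when a calculation deliberately leaves the bias out.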
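Fully connected layer
Because there is no weight sharing, the layer's parameter count, which is also its FLOPs count, is $N_{in} \times N_{out} + N_{out}$ ($N_{in} \times N_{out}$ weights plus $N_{out}$ biases).
-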
Model FLOPs
Convolutional layer
$$
\text{FLOPs} = \text{parameter count} \times \text{output feature map size}, \qquad \text{output feature map size} = h \times w
$$
Fully connected layer
Because there is no weight sharing, the layer's FLOPs count equals its parameter count:
$$
N_{in} \times N_{out} + N_{out}
$$
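The two FLOPs rules above can be sketched in Python as follows. The helper names conv_flops and fc_flops are assumptions made for illustration. The count follows this post's convention, which in effect treats one multiply-accumulate as a single operation; some tools count it as two, so their totals can differ by roughly a factor of two.

```python
def conv_flops(kernel_h, kernel_w, in_channels, num_kernels, out_h, out_w, bias=True):
    """FLOPs of a conv layer: parameter count times output feature map size."""
    params = kernel_h * kernel_w * in_channels * num_kernels
    if bias:
        params += num_kernels
    return params * out_h * out_w


def fc_flops(n_in, n_out, bias=True):
    """FLOPs of a fully connected layer; equal to its parameter count."""
    return n_in * n_out + (n_out if bias else 0)


# Toy sizes chosen only for illustration.
print(conv_flops(3, 3, 16, 32, out_h=8, out_w=8))  # (3*3*16*32 + 32) * 64 = 296960
print(fc_flops(128, 10))                            # 128*10 + 10 = 1290
```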
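Counting parameters and FLOPs in deep learning (using the classic AlexNet CNN architecture as an example) - Never-Giveup's blog - CSDN blog
https://blog.csdn.net/qq_36653505/article/details/86700885
Plain talk on classic CNN models: AlexNet - 雪饼's personal space - OSCHINA
https://my.oschina.net/u/876354/blog/1633143
- How the number of multiplications in a matrix product is counted
How the number of matrix multiplications is computed - Kellbook's blog - CSDN blog https://blog.csdn.net/qq_30622831/article/details/82730986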
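When computing FLOPs, the number of multiplications each kernel performs per output position is
kernel_ops = self.kernel_size[0] * self.kernel_size[1] * (self.in_channels / self.groups)
See
rethinking-network-pruning/compute_flops.py at master · Eric-mingjie/rethinking-network-pruning https://github.com/Eric-mingjie/rethinking-network-pruning/blob/master/cifar/l1-norm-pruning/compute_flops.py
Here self.groups is the groups argument of the convolution: with groups > 1, each kernel spans only in_channels / groups input channels. For details, see
The role of the group parameter in PyTorch functions - 慢行厚积 - 博客园 https://www.cnblogs.com/wanghui-garcia/p/10775851.html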
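To see what dividing by self.groups does to the count, here is a small self-contained sketch in plain Python. It is not the code from compute_flops.py; the function grouped_conv_weights and its arguments are hypothetical, and it only assumes that in a grouped convolution each kernel spans in_channels / groups input channels.

```python
def grouped_conv_weights(kernel_h, kernel_w, in_channels, out_channels, groups=1):
    """Weight count (bias ignored) of a grouped convolution.

    Each kernel only sees in_channels // groups input channels, which is
    what the kernel_ops expression quoted above accounts for.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    kernel_ops = kernel_h * kernel_w * (in_channels // groups)
    return kernel_ops * out_channels


print(grouped_conv_weights(3, 3, 64, 128, groups=1))  # 73728
print(grouped_conv_weights(3, 3, 64, 128, groups=2))  # 36864
print(grouped_conv_weights(3, 3, 64, 64, groups=64))  # 576 (depthwise case)
```

Multiplying any of these values by the output feature map size gives the layer's FLOPs under the convention used in this post.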
Example 1 (VGG16 on ImageNet)
The calculations in the figure ignore the bias parameters (one per kernel).
-
LayerID1 convolution kernel calculation
Parameter count
$$
\begin{aligned}
\text{Conv1\_1} &= [(\text{LayerID1, Patch Size (Height)}) \times (\text{LayerID1, Patch Size (Width)}) \times (\text{LayerID0, Output Size (Channel)})] \\
&\quad \times (\text{LayerID1, Output Size (Channel)}) \\
&= (3 \times 3 \times 3) \times 64 \\
&= 1728
\end{aligned}
$$
FLOPs calculation
$$
\begin{aligned}
\text{Conv1\_1} &= [(\text{LayerID1, Patch Size (Height)}) \times (\text{LayerID1, Patch Size (Width)}) \times (\text{LayerID0, Output Size (Channel)})] \\
&\quad \times (\text{LayerID1, Output Size (Channel)}) \\
&\quad \times (\text{LayerID1, Output Size (Height)}) \times (\text{LayerID1, Output Size (Width)}) \\
&= (3 \times 3 \times 3) \times 64 \times 224 \times 224 \\
&= 1728 \times 224 \times 224 \\
&= 86{,}704{,}128
\end{aligned}
$$
-
LayerID12 convolution kernel calculation
Parameter count
$$
\begin{aligned}
\text{Conv4\_1} &= [(\text{LayerID12, Patch Size (Height)}) \times (\text{LayerID12, Patch Size (Width)}) \times (\text{LayerID11, Output Size (Channel)})] \\
&\quad \times (\text{LayerID12, Output Size (Channel)}) \\
&= (3 \times 3 \times 256) \times 512 \\
&= 1{,}179{,}648
\end{aligned}
$$
FLOPs calculation
$$
\begin{aligned}
\text{Conv4\_1} &= [(\text{LayerID12, Patch Size (Height)}) \times (\text{LayerID12, Patch Size (Width)}) \times (\text{LayerID11, Output Size (Channel)})] \\
&\quad \times (\text{LayerID12, Output Size (Channel)}) \\
&\quad \times (\text{LayerID12, Output Size (Height)}) \times (\text{LayerID12, Output Size (Width)}) \\
&= (3 \times 3 \times 256) \times 512 \times 28 \times 28 \\
&= 1{,}179{,}648 \times 28 \times 28 \\
&= 924{,}844{,}032
\end{aligned}
$$
-
LayerID22 fully connected layer calculation
Parameter count
$$
\begin{aligned}
\text{FC1} &= [(\text{LayerID21, Output Size (Height)}) \times (\text{LayerID21, Output Size (Width)}) \times (\text{LayerID21, Output Size (Channel)})] \\
&\quad \times [(\text{LayerID22, Output Size (Height)}) \times (\text{LayerID22, Output Size (Width)}) \times (\text{LayerID22, Output Size (Channel)})] \\
&= (7 \times 7 \times 512) \times (1 \times 1 \times 4096) \\
&= 102{,}760{,}448
\end{aligned}
$$
FLOPs calculation
$$
\begin{aligned}
\text{FC1} &= [(\text{LayerID21, Output Size (Height)}) \times (\text{LayerID21, Output Size (Width)}) \times (\text{LayerID21, Output Size (Channel)})] \\
&\quad \times [(\text{LayerID22, Output Size (Height)}) \times (\text{LayerID22, Output Size (Width)}) \times (\text{LayerID22, Output Size (Channel)})] \\
&= (7 \times 7 \times 512) \times (1 \times 1 \times 4096) \\
&= 102{,}760{,}448
\end{aligned}
$$
-
LayerID23 fully connected layer calculation
Parameter count
$$
\begin{aligned}
\text{FC2} &= [(\text{LayerID22, Output Size (Height)}) \times (\text{LayerID22, Output Size (Width)}) \times (\text{LayerID22, Output Size (Channel)})] \\
&\quad \times [(\text{LayerID23, Output Size (Height)}) \times (\text{LayerID23, Output Size (Width)}) \times (\text{LayerID23, Output Size (Channel)})] \\
&= (1 \times 1 \times 4096) \times (1 \times 1 \times 4096) \\
&= 16{,}777{,}216
\end{aligned}
$$
FLOPs calculation
$$
\begin{aligned}
\text{FC2} &= [(\text{LayerID22, Output Size (Height)}) \times (\text{LayerID22, Output Size (Width)}) \times (\text{LayerID22, Output Size (Channel)})] \\
&\quad \times [(\text{LayerID23, Output Size (Height)}) \times (\text{LayerID23, Output Size (Width)}) \times (\text{LayerID23, Output Size (Channel)})] \\
&= (1 \times 1 \times 4096) \times (1 \times 1 \times 4096) \\
&= 16{,}777{,}216
\end{aligned}
$$
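The four worked results in this example can be reproduced with a few lines of Python. This is a sketch under the same assumption as the figure (bias ignored); the layer shapes are taken from the calculations above, while the variable names and the tuple layout are illustrative.

```python
# (layer id, kernel_h, kernel_w, in_channels, out_channels, out_h, out_w)
# A fully connected layer is treated here as a kernel that covers its whole
# input volume and produces a 1x1 output map.
layers = [
    ("LayerID1", 3, 3, 3, 64, 224, 224),     # Conv1_1
    ("LayerID12", 3, 3, 256, 512, 28, 28),   # Conv4_1
    ("LayerID22", 7, 7, 512, 4096, 1, 1),    # first fully connected layer
    ("LayerID23", 1, 1, 4096, 4096, 1, 1),   # second fully connected layer
]

results = {}
for name, kh, kw, c_in, c_out, out_h, out_w in layers:
    params = kh * kw * c_in * c_out   # bias ignored, as in the figure
    flops = params * out_h * out_w    # parameter count x output map size
    results[name] = (params, flops)
    print(f"{name}: params={params:,} FLOPs={flops:,}")
```

For the fully connected layers the output map is 1x1, so the FLOPs count and the parameter count coincide, which matches the rule stated earlier.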
