First, the official documentation:

torch.nn.functional — PyTorch 1.13 documentation

torch.nn.functional.kl_div(input, target, size_average=None, reduce=None, reduction='mean')

Parameters
- input – Tensor of arbitrary shape
- target – Tensor of the same shape as input
- size_average (bool, optional) – Deprecated (see reduction). By default, the losses are averaged over each loss element in the batch. Note that for some losses, there are multiple elements per sample. If the field size_average is set to False, the losses are instead summed for each minibatch. Ignored when reduce is False. Default: True
- reduce (bool, optional) – Deprecated (see reduction). By default, the losses are averaged or summed over observations for each minibatch depending on size_average. When reduce is False, returns a loss per batch element instead and ignores size_average. Default: True
- reduction (string, optional) – Specifies the reduction to apply to the output: 'none' | 'batchmean' | 'sum' | 'mean'. 'none': no reduction will be applied. 'batchmean': the sum of the output will be divided by the batch size. 'sum': the output will be summed. 'mean': the output will be divided by the number of elements in the output. Default: 'mean'
Now let's see how to use it. The first argument must be a log-probability matrix, and the second a probability matrix. This is important; otherwise the KL divergence you compute can come out negative.
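To see why the log matters, here is a minimal sketch (the probability values are made up for illustration). F.kl_div computes target * (log(target) - input) pointwise, so passing raw probabilities where log-probabilities are expected can yield a negative number, which a true KL divergence never is.

import torch
import torch.nn.functional as F

p = torch.tensor([[0.3, 0.7]])  # fitted distribution (made-up values)
q = torch.tensor([[0.6, 0.4]])  # target distribution (made-up values)

# Wrong: raw probabilities as the first argument -> result can be negative
wrong = F.kl_div(p, q, reduction='sum')

# Right: the first argument must be log-probabilities
right = F.kl_div(p.log(), q, reduction='sum')

print(wrong)  # tensor(-1.1330): not a valid KL divergence
print(right)  # tensor(0.1920): non-negative, the true KL(q || p)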
Say we have two matrices X and Y. Because the KL divergence is asymmetric, there is a guiding/guided relationship between the two distributions, so the order in which the two matrices are passed has to be chosen carefully.
For example, if we want Y to guide X, pass X as the first argument and Y as the second; that is, the distribution being guided goes first. Then just compute the corresponding probabilities and log-probabilities.
import torch
import torch.nn.functional as F

# Define two matrices
x = torch.randn((4, 5))
y = torch.randn((4, 5))

# Since y guides x, take the log-probabilities of x and the probabilities of y
logp_x = F.log_softmax(x, dim=-1)
p_y = F.softmax(y, dim=-1)

kl_sum = F.kl_div(logp_x, p_y, reduction='sum')
kl_mean = F.kl_div(logp_x, p_y, reduction='mean')
print(kl_sum, kl_mean)
>>> tensor(3.4165) tensor(0.1708)
To echo a comment below explaining the reduction parameter: 'mean' averages over all elements, so kl_sum is 4*5 = 20 times kl_mean; indeed, 3.4165 ≈ 0.1708 × 20.
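For completeness, here is a small self-contained sketch of the remaining reduction options: 'none' keeps the elementwise losses, and 'batchmean' divides the sum by the batch size (here 4) rather than by the element count; per the PyTorch docs, 'batchmean' is the option that matches the mathematical definition of the mean KL divergence over a batch.

import torch
import torch.nn.functional as F

logp_x = F.log_softmax(torch.randn((4, 5)), dim=-1)
p_y = F.softmax(torch.randn((4, 5)), dim=-1)

kl_none = F.kl_div(logp_x, p_y, reduction='none')            # elementwise, shape (4, 5)
kl_sum = F.kl_div(logp_x, p_y, reduction='sum')              # summed over all 20 elements
kl_batchmean = F.kl_div(logp_x, p_y, reduction='batchmean')  # sum divided by batch size

print(kl_none.shape)                             # torch.Size([4, 5])
print(torch.allclose(kl_none.sum(), kl_sum))     # True
print(torch.allclose(kl_batchmean, kl_sum / 4))  # True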
KL distance (Kullback–Leibler divergence)
Let P and Q be two probability distributions (i.e. each sums to 1, like the output of a softmax). The KL divergence is defined as

KL(P \| Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}

where P(x) is the true distribution (the target) and Q(x) is the fitted distribution (the one we want to change). The smaller the KL divergence, the closer the two distributions; it is always ≥ 0, with equality exactly when P(x) = Q(x).
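To connect this formula back to F.kl_div, here is a quick sketch (with arbitrary random inputs) checking that the built-in function reproduces Σ P(x) log(P(x)/Q(x)) when the target plays the role of P and the input the role of log Q:

import torch
import torch.nn.functional as F

p = F.softmax(torch.randn(4, 5), dim=-1)          # target distribution P
log_q = F.log_softmax(torch.randn(4, 5), dim=-1)  # log of the fitted distribution Q

builtin = F.kl_div(log_q, p, reduction='sum')
manual = (p * (p.log() - log_q)).sum()            # sum of P(x) * log(P(x) / Q(x))

print(torch.allclose(builtin, manual))            # True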