我想计算一组循环数据的平均值.例如,我可能会从阅读指南中获得几个样本.问题当然是如何处理环绕.相同的算法可能对时钟表有用.
实际问题更复杂 - 统计在球体上或在"包裹"的代数空间中意味着什么,例如添加剂组mod n.答案可能不是唯一的,例如359度和1度的平均值可能是0度或180度,但统计上0看起来更好.
这对我来说是一个真正的编程问题,我试图让它看起来不像是一个数学问题.
从角度计算单位矢量并取其平均值的角度.
这个问题在书中详细研究:"球体统计",阿肯色大学Geoffrey S. Watson,数学科学讲义,1983 John Wiley&Sons,Inc.,http://catless.ncl. Bruce Karsh的ac.uk/Risks/7.44.html#subj4.
从一组角度测量a [i] 0 <= i估计平均角度A的好方法
sum_i_from_1_to_N sin(a[i]) a = arctangent --------------------------- sum_i_from_1_to_N cos(a[i])
starblue给出的方法在计算上是等价的,但他的理由更清晰,可能在程序上更有效率,并且在零情况下也能很好地工作,所以对他有好感.
现在,在维基百科上更详细地探讨该主题,以及其他用途,如小数部分.
我看到了问题 - 例如,如果你有45'角和315'角,"自然"平均值将是180',但你想要的值实际上是0'.
我认为Starblue正在做点什么.只需计算每个角度的(x,y)笛卡尔坐标,并将这些结果矢量相加.最终矢量的角度偏移应该是您所需的结果.
x = y = 0 foreach angle { x += cos(angle) y += sin(angle) } average_angle = atan2(y, x)
我现在忽略了罗盘标题从北方开始,顺时针方向,而"正常"笛卡尔坐标沿X轴从零开始,然后逆时针方向.无论如何,数学应该以相同的方式运作.
对于两个角度的特殊情况:
答案((a + b)mod 360)/ 2是错误的.对于角度350和2,最近点是356,而不是176.
单位矢量和trig解决方案可能太昂贵.
我从一点点的修补中得到的是:
diff = ( ( a - b + 180 + 360 ) mod 360 ) - 180 angle = (360 + b + ( diff / 2 ) ) mod 360
0,180 - > 90(两个答案:这个等式从a顺时针回答)
180,0 - > 270(见上文)
180,1 - > 90.5
1,180 - > 90.5
20,350 - > 5
350,20 - > 5(以下所有示例也正确反转)
10,20 - > 15
350,2 - > 356
359,0 - > 359.5
180,180 - > 180
ackb是正确的,这些基于矢量的解决方案不能被认为是真正的角度平均值,它们只是单位矢量对应物的平均值.但是,ackb的建议解决方案似乎没有数学上的声音.
以下是从最小化(angle [i] - avgAngle)^ 2(其中必要时校正差异)的目标数学推导出的解决方案,这使其成为角度的真实算术平均值.
首先,我们需要确切地考虑哪些情况下角度之间的差异与它们的正常数字对应物之间的差异不同.考虑角度x和y,如果y> = x - 180且y <= x + 180,那么我们可以直接使用差值(xy).否则,如果不满足第一个条件,那么我们必须在计算中使用(y + 360)而不是y.相应的,如果不满足第二个条件,那么我们必须使用(y-360)而不是y.由于曲线方程我们只是最小化这些不等式从真变为假的点的变化,反之亦然,我们可以将整个[0,360]范围分成由这些点分开的一组段.然后,我们只需要找到每个段的最小值,然后找到每个段的最小值,即平均值.
这是一张图像,展示了计算角度差异时出现问题的位置.如果x位于灰色区域,则会出现问题.
为了最小化变量,根据曲线,我们可以得到我们想要最小化的导数,然后我们找到转折点(这是导数= 0的地方).
在这里,我们将应用最小化平方差的概念来推导出通用的算术平均公式:sum(a [i])/ n.曲线y = sum((a [i] -x)^ 2)可以通过这种方式最小化:
y = sum((a[i]-x)^2) = sum(a[i]^2 - 2*a[i]*x + x^2) = sum(a[i]^2) - 2*x*sum(a[i]) + n*x^2 dy\dx = -2*sum(a[i]) + 2*n*x for dy/dx = 0: -2*sum(a[i]) + 2*n*x = 0 -> n*x = sum(a[i]) -> x = sum(a[i])/n
现在将它应用于具有调整后差异的曲线:
b = a的子集,其中正确(角度)差异a [i] -xc = a的子集,其中正确(角度)差异(a [i] -360)-x cn = cd的大小= a的子集正确(角度)差(a [i] +360)-x dn = d的大小
y = sum((b[i]-x)^2) + sum(((c[i]-360)-b)^2) + sum(((d[i]+360)-c)^2) = sum(b[i]^2 - 2*b[i]*x + x^2) + sum((c[i]-360)^2 - 2*(c[i]-360)*x + x^2) + sum((d[i]+360)^2 - 2*(d[i]+360)*x + x^2) = sum(b[i]^2) - 2*x*sum(b[i]) + sum((c[i]-360)^2) - 2*x*(sum(c[i]) - 360*cn) + sum((d[i]+360)^2) - 2*x*(sum(d[i]) + 360*dn) + n*x^2 = sum(b[i]^2) + sum((c[i]-360)^2) + sum((d[i]+360)^2) - 2*x*(sum(b[i]) + sum(c[i]) + sum(d[i])) - 2*x*(360*dn - 360*cn) + n*x^2 = sum(b[i]^2) + sum((c[i]-360)^2) + sum((d[i]+360)^2) - 2*x*sum(x[i]) - 2*x*360*(dn - cn) + n*x^2 dy/dx = 2*n*x - 2*sum(x[i]) - 2*360*(dn - cn) for dy/dx = 0: 2*n*x - 2*sum(x[i]) - 2*360*(dn - cn) = 0 n*x = sum(x[i]) + 360*(dn - cn) x = (sum(x[i]) + 360*(dn - cn))/n
This alone is not quite enough to get the minimum, while it works for normal values, that has an unbounded set, so the result will definitely lie within set's range and is therefore valid. We need the minimum within a range (defined by the segment). If the minimum is less than our segment's lower bound then the minimum of that segment must be at the lower bound (because quadratic curves only have 1 turning point) and if the minimum is greater than our segment's upper bound then the segment's minimum is at the upper bound. After we have the minimum for each segment, we simply find the one that has the lowest value for what we're minimising (sum((b[i]-x)^2) + sum(((c[i]-360)-b)^2) + sum(((d[i]+360)-c)^2)).
这是曲线的图像,显示了它在x =(a [i] +180)%360的点处的变化.有问题的数据集是{65,92,230,320,250}.
以下是Java中算法的实现,包括一些优化,其复杂度为O(nlogn).如果将基于比较的排序替换为基于非比较的排序(例如基数排序),则可以将其简化为O(n).
static double varnc(double _mean, int _n, double _sumX, double _sumSqrX) { return _mean*(_n*_mean - 2*_sumX) + _sumSqrX; } //with lower correction static double varlc(double _mean, int _n, double _sumX, double _sumSqrX, int _nc, double _sumC) { return _mean*(_n*_mean - 2*_sumX) + _sumSqrX + 2*360*_sumC + _nc*(-2*360*_mean + 360*360); } //with upper correction static double varuc(double _mean, int _n, double _sumX, double _sumSqrX, int _nc, double _sumC) { return _mean*(_n*_mean - 2*_sumX) + _sumSqrX - 2*360*_sumC + _nc*(2*360*_mean + 360*360); } static double[] averageAngles(double[] _angles) { double sumAngles; double sumSqrAngles; double[] lowerAngles; double[] upperAngles; { ListlowerAngles_ = new LinkedList (); List upperAngles_ = new LinkedList (); sumAngles = 0; sumSqrAngles = 0; for(double angle : _angles) { sumAngles += angle; sumSqrAngles += angle*angle; if(angle < 180) lowerAngles_.add(angle); else if(angle > 180) upperAngles_.add(angle); } Collections.sort(lowerAngles_); Collections.sort(upperAngles_,Collections.reverseOrder()); lowerAngles = new double[lowerAngles_.size()]; Iterator lowerAnglesIter = lowerAngles_.iterator(); for(int i = 0; i < lowerAngles_.size(); i++) lowerAngles[i] = lowerAnglesIter.next(); upperAngles = new double[upperAngles_.size()]; Iterator upperAnglesIter = upperAngles_.iterator(); for(int i = 0; i < upperAngles_.size(); i++) upperAngles[i] = upperAnglesIter.next(); } List averageAngles = new LinkedList (); averageAngles.add(180d); double variance = varnc(180,_angles.length,sumAngles,sumSqrAngles); double lowerBound = 180; double sumLC = 0; for(int i = 0; i < lowerAngles.length; i++) { //get average for a segment based on minimum double testAverageAngle = (sumAngles + 360*i)/_angles.length; //minimum is outside segment range (therefore not directly relevant) //since it is greater than lowerAngles[i], the minimum for the segment //must lie on the boundary lowerAngles[i] if(testAverageAngle > lowerAngles[i]+180) testAverageAngle = lowerAngles[i]; if(testAverageAngle > lowerBound) { double testVariance = varlc(testAverageAngle,_angles.length,sumAngles,sumSqrAngles,i,sumLC); if(testVariance < variance) { averageAngles.clear(); averageAngles.add(testAverageAngle); variance = testVariance; } else if(testVariance == variance) averageAngles.add(testAverageAngle); } lowerBound = lowerAngles[i]; sumLC += lowerAngles[i]; } //Test last segment { //get average for a segment based on minimum double testAverageAngle = (sumAngles + 360*lowerAngles.length)/_angles.length; //minimum is inside segment range //we will test average 0 (360) later if(testAverageAngle < 360 && testAverageAngle > lowerBound) { double testVariance = varlc(testAverageAngle,_angles.length,sumAngles,sumSqrAngles,lowerAngles.length,sumLC); if(testVariance < variance) { averageAngles.clear(); averageAngles.add(testAverageAngle); variance = testVariance; } else if(testVariance == variance) averageAngles.add(testAverageAngle); } } double upperBound = 180; double sumUC = 0; for(int i = 0; i < upperAngles.length; i++) { //get average for a segment based on minimum double testAverageAngle = (sumAngles - 360*i)/_angles.length; //minimum is outside segment range (therefore not directly relevant) //since it is greater than lowerAngles[i], the minimum for the segment //must lie on the boundary lowerAngles[i] if(testAverageAngle < upperAngles[i]-180) testAverageAngle = upperAngles[i]; if(testAverageAngle < upperBound) { double testVariance = varuc(testAverageAngle,_angles.length,sumAngles,sumSqrAngles,i,sumUC); if(testVariance < variance) { averageAngles.clear(); averageAngles.add(testAverageAngle); variance = testVariance; } else if(testVariance == variance) averageAngles.add(testAverageAngle); } upperBound = upperAngles[i]; sumUC += upperBound; } //Test last segment { //get average for a segment based on minimum double testAverageAngle = (sumAngles - 360*upperAngles.length)/_angles.length; //minimum is inside segment range //we test average 0 (360) now if(testAverageAngle < 0) testAverageAngle = 0; if(testAverageAngle < upperBound) { double testVariance = varuc(testAverageAngle,_angles.length,sumAngles,sumSqrAngles,upperAngles.length,sumUC); if(testVariance < variance) { averageAngles.clear(); averageAngles.add(testAverageAngle); variance = testVariance; } else if(testVariance == variance) averageAngles.add(testAverageAngle); } } double[] averageAngles_ = new double[averageAngles.size()]; Iterator averageAnglesIter = averageAngles.iterator(); for(int i = 0; i < averageAngles_.length; i++) averageAngles_[i] = averageAnglesIter.next(); return averageAngles_; }
一组角度的算术平均值可能与您对平均值应该是什么的直观概念不一致.例如,集合{179,179,0,181,181}的算术平均值是216(和144).您立即想到的答案可能是180,但众所周知,算术平均值受边缘值的影响很大.你还应该记住角度不是矢量,有时候在处理角度时看起来很吸引人.
该算法当然也适用于服从模数运算(具有最小调整)的所有量,例如一天中的时间.
我还要强调,即使这是一个真正的角度平均值,与矢量解决方案不同,这并不一定意味着它是你应该使用的解决方案,相应单位矢量的平均值可能就是你实际的值应该使用.
您必须更准确地定义平均值.对于两个角度的具体情况,我可以想到两种不同的场景:
"真实"平均值,即(a + b)/ 2%360.
在保持同一个半圆的同时指向"两个"之间的角度,例如355和5,这将是0,而不是180.为此,您需要检查两个角度之间的差异是否大于180或不.如果是这样,请在使用上述公式之前将较小的角度增加360.
但是,我不知道第二种替代方案如何针对两个以上角度的情况进行推广.