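Status: So far, the best answer's program runs in 33% of the time of the original program! But there may still be other ways to optimize it.
Lua is currently the fastest scripting language out there, but it scores very poorly against C/C++ in a few benchmarks.
One of these is the mandelbrot test (generate a Mandelbrot set portable bitmap file, N = 16,000), where it scores a horrible 1:109 (multi-core) or 1:28 (single core).
Since the delta in speed is so large, this is a good candidate for optimization. Also, I'm sure that some of those who know who Mike Pall is may believe that it is not possible to optimize this any further, but that is plainly wrong. Anyone who has done optimization knows that it is always possible to do better. Besides, I managed to get some extra performance with a few tweaks, so I know it's possible :)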
    -- The Computer Language Shootout
    -- http://shootout.alioth.debian.org/
    -- contributed by Mike Pall

    local width = tonumber(arg and arg[1]) or 100
    local height, wscale = width, 2/width
    local m, limit2 = 50, 4.0
    local write, char = io.write, string.char

    write("P4\n", width, " ", height, "\n")

    for y=0,height-1 do
      local Ci = 2*y / height - 1
      for xb=0,width-1,8 do
        local bits = 0
        local xbb = xb+7
        for x=xb,xbb < width and xbb or width-1 do
          bits = bits + bits
          local Zr, Zi, Zrq, Ziq = 0.0, 0.0, 0.0, 0.0
          local Cr = x * wscale - 1.5
          for i=1,m do
            local Zri = Zr*Zi
            Zr = Zrq - Ziq + Cr
            Zi = Zri + Zri + Ci
            Zrq = Zr*Zr
            Ziq = Zi*Zi
            if Zrq + Ziq > limit2 then
              bits = bits + 1
              break
            end
          end
        end
        if xbb >= width then
          for x=width,xbb do bits = bits + bits + 1 end
        end
        write(char(255-bits))
      end
    end
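So how can this be optimized? (Of course, as with any optimization, you have to measure your implementation to be sure it is actually faster.) You are not allowed to alter the C core of Lua for this, or to use LuaJIT; the point is to find ways to optimize around one of Lua's weak points.
Edit: Putting a bounty on this to make the challenge more fun.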
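Pass 2: about 30% better (on my machine) than my previous attempt. The main savings came from unrolling the inner loop to amortize the loop overhead.
Also included (commented out) is an aborted attempt to save time by quitting early (and setting the pixel black) when you are stuck in the central cardioid. It works, but it came out slower no matter how I jiggered it.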
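For reference, here is a minimal sketch of one commonly used closed-form test for the central cardioid and the period-2 bulb next to it. This is a different formulation from the commented-out attempt in the listing, and the function names are hypothetical. Points inside either region never escape, so the iteration could in principle be skipped for them; whether the extra arithmetic pays off in plain Lua would have to be measured.

```lua
-- Sketch only: closed-form membership tests for the two largest regions
-- of the Mandelbrot set. All names here are illustrative assumptions.

-- True when (cr, ci) lies inside (or on the edge of) the main cardioid.
local function in_main_cardioid(cr, ci)
  local x = cr - 0.25
  local ci2 = ci * ci
  local q = x * x + ci2
  return q * (q + x) <= 0.25 * ci2
end

-- True when (cr, ci) lies inside the disc of radius 1/4 centred at -1.
local function in_period2_bulb(cr, ci)
  local x = cr + 1.0
  return x * x + ci * ci <= 0.0625
end

-- True when the pixel can be marked as "inside" without iterating.
local function skip_iteration(cr, ci)
  return in_main_cardioid(cr, ci) or in_period2_bulb(cr, ci)
end
```

In the benchmark loop, such a check would run once per pixel, before the `for i=1,m` loop, using the `Cr` and `Ci` values already computed there.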
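I've got to run, but I will leave a parting suggestion. There may be some optimization possible by run-length encoding the results (so instead of saving a bunch of bit-twiddled chars, you'd save a list: number of white dots, number of black dots, number of white dots, etc.). This would:
Reduce the storage / GC overhead
Permit some optimizations in the output generation (when the run lengths are much greater than 8)
Permit some orbit detection.
I don't know if it can be coded tightly enough to fly, but that's where I'd try next if I had more time.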
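As a rough illustration of the run-length idea, the sketch below encodes a row of 0/1 pixels as a list of run lengths and expands it again. The function names and the `first` field are assumptions made for this example, and nothing here has been benchmarked against the bit-packed version.

```lua
-- Sketch only: run-length encode a row of 0/1 pixels.
-- Result layout (an assumption of this example):
--   { first = <value of the first run>, n1, n2, ... }
-- where consecutive runs alternate between 0 and 1.
local function rle_encode(pixels)
  local runs = { first = pixels[1] }
  local current, count = pixels[1], 0
  for i = 1, #pixels do
    if pixels[i] == current then
      count = count + 1
    else
      runs[#runs + 1] = count
      current, count = pixels[i], 1
    end
  end
  if count > 0 then runs[#runs + 1] = count end
  return runs
end

-- Expand the runs back into a flat 0/1 array (inverse of rle_encode).
local function rle_decode(runs)
  local pixels, value = {}, runs.first
  for _, n in ipairs(runs) do
    for _ = 1, n do pixels[#pixels + 1] = value end
    value = 1 - value
  end
  return pixels
end
```

For example, `rle_encode({1, 1, 1, 0, 0, 1})` yields the runs 3, 2, 1 with `first = 1`. A long run could then be written out as whole bytes of `0x00` or `0xFF` rather than bit by bit, which is where the output-side saving would come from.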
    -- The Computer Language Shootout
    -- http://shootout.alioth.debian.org/
    -- contributed by Mike Pall
    -- with optimizations by Markus J. Q. (MarkusQ) Roberts

    local width = tonumber(arg and arg[1]) or 100
    local height, wscale = width, 2/width
    local m, limit2 = 50, 4.0
    local write, char = io.write, string.char
    local h2 = math.floor(height/2)
    local hm = height - h2*2
    local top_half = {}

    for y=0,h2+hm do
      local Ci = 2*y / height - 1
      local line = {""}
      for xb=0,width-1,8 do
        local bits = 0
        local xbb = xb+7
        for x=xb,xbb < width and xbb or width-1 do
          bits = bits + bits
          local Zr, Zi, Zrq, Ziq = 0.0, 0.0, 0.0, 0.0
          local Cr = x * wscale - 1.5
          local Zri = Zr*Zi
          for i=1,m/5 do
            Zr = Zrq - Ziq + Cr
            Zi = Zri + Zri + Ci
            Zri = Zr*Zi

            Zr = Zr*Zr - Zi*Zi + Cr
            Zi = 2*Zri + Ci
            Zri = Zr*Zi

            Zr = Zr*Zr - Zi*Zi + Cr
            Zi = 2*Zri + Ci
            Zri = Zr*Zi

            Zr = Zr*Zr - Zi*Zi + Cr
            Zi = 2*Zri + Ci
            Zri = Zr*Zi

            Zr = Zr*Zr - Zi*Zi + Cr
            Zi = 2*Zri + Ci
            Zri = Zr*Zi

            Zrq = Zr*Zr
            Ziq = Zi*Zi
            Zri = Zr*Zi
            if Zrq + Ziq > limit2 then
              bits = bits + 1
              break
            end
            -- if i == 1 then
            --   local ar,ai = 1-4*Zr,-4*Zi
            --   local a_r = math.sqrt(ar*ar+ai*ai)
            --   local k = math.sqrt(2)/2
            --   local br,bi2 = math.sqrt(a_r+ar)*k,(a_r-ar)/2
            --   if (br+1)*(br+1) + bi2 < 1 then
            --     break
            --   end
            -- end
          end
        end
        for x=width,xbb do
          bits = bits + bits + 1
        end
        table.insert(line,char(255-bits))
      end
      line = table.concat(line)
      table.insert(top_half,line)
    end

    write("P4\n", width, " ", height, "\n")
    for y=1,h2+hm do
      write(top_half[y])
    end
    for y=h2,1,-1 do
      write(top_half[y])
    end
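The listing above also computes only part of the image and writes the stored rows back in reverse order for the rest. As a minimal sketch of the symmetry this relies on (the helper name `source_row` is hypothetical and not part of the program): with `Ci = 2*y / height - 1`, rows `y` and `height - y` have opposite `Ci`, and the set is symmetric about the real axis, so their pixels should match up to floating-point rounding in `Ci`.

```lua
-- Sketch only: for a 0-based row y, return the 0-based row in the range
-- 0 .. floor(height/2) whose pixels can be reused for it.
-- Assumes Ci = 2*y/height - 1, as in the listing above.
local function source_row(y, height)
  if 2 * y <= height then
    return y            -- top part (includes the Ci = 0 row for even height)
  end
  return height - y     -- mirrored row: Ci(height - y) == -Ci(y)
end
```

Under this mapping, row 0 (`Ci = -1`) has no partner inside the image, so for `height = 4` the rows come from 0, 1, 2, 1. It may be worth comparing the output of any mirrored variant byte for byte against the unoptimized program.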