int型配列要素数100万の合計を100回求める処理時間測定
まとめ
forかforeachが速い
LINQは遅いけど今どきのCPUなら10億回の計算でも0.5秒で終わる
VectorのAddはforより速いこともあるけど条件がめんどくさい
2020/02/11追記ここから
gogowaten.hatenablog.com
2020/02/11追記ここまで
環境
CPU AMD Ryzen 5 2400G(4コア8スレッド)
MEM DDR4-2666
Window 10 Home 64bit
Visual Studio 2019 Community .NET Core 3.1 WPF C#
左にあるボタンで計測開始、終わるまで待つ、途中でキャンセルできない
今回のアプリダウンロード
20200205_int配列の値の合計.zip
ギットハブ
github.com
値は1からの連番の値が入っているので合計値は
detail.chiebukuro.yahoo.co.jp
こういう方法があるんだなあ
(1+1000000)*(1000000-1+1)/2=5.000005e+11
で
500000500000が正解
計算方法は大きく分けて3つ
速かったのは普通のforやforeachで0.055秒
LINQは遅くてforやforeachの10倍かかって0.55秒
VectorのAddはオーバーフローしないようにlong型配列に変換していて、この変換方法の違いで0.042~2秒と大きく違いが出た
System.Numerics.VectorのAdd
2つのVector<int>を渡すと、それぞれのペアの合計値のVector<int>を返す。
int型配列からVector<int>を作ってVector.Addで足し算
va = {10, 11, 12, 13, 14, 15, 16, 17}
vb = {10, 10, 10, 10, 10, 10, 10, 10}
vv = Vector.Add(va, vb)
vv = {20, 21, 22, 23, 24, 25, 26, 27}
問題ない、あとはvvの各要素をforで足し算すればいい
オーバーフロー
55行目を変更、int型の最大値の2147483647に変更して足し算
結果
va {<10, 11, 12, 13, 14, 15, 16, 17>}
vb {<2147483647, 2147483647, 2147483647, 2147483647, 2147483647, 2147483647, 2147483647, 2147483647>}
vv = Vector.Add(va, vb)
vv {<-2147483639, -2147483638, -2147483637, -2147483636, -2147483635, -2147483634, -2147483633, -2147483632>}
オーバーフローしてマイナスの値になってしまったので
Vector<long>を使って計算すればいいんだけど
int型配列からVector<long>は作れないみたいなので
long型配列に変換するというか、新たに作る
ArrayのCopyToメソッドでint型配列をlong型配列にコピーして、それを使ってVector<long>を作る
これでようやく計算できた
今の環境だとVector<long>を使って一度に計算できる個数は4つまでだった、Vector<int>なら8個なので計算速度も半分になっているのかも、それでもそんなことは些細なことになってしまうくらい、intからlongへの変換の処理が重たい
forとforeach
普通に速い
LINQのSum
long型へのキャストが必要で、この書き方がわからなくてググった
www.atmarkit.co.jp
たった1行で済む!処理速度はforの10倍かかるけど普通なら十分な速度
ここからVectorのAdd
int型配列を元に一括でlong型へ変換して、そこから4個づつ取り出してVector<long>を作成している(98と103行目)
結果は0.47秒とLINQと同じくらい遅い、しかもメモリをたくさん使う。これは一括でlong型配列を作っているからみたいで、これ以外の方法では43BMだったのが、70~170MBくらいになる。
一括じゃなくて計算する分だけをその都度変換するようにしたのが
これでメモリ使用量も普通になって処理速度も3倍くらい速くなった、とはいってもforに比べると4倍くらい遅い
Vector<long>.Count決め打ち
Vector<long>.Countの値は今の環境では4だからこれを決め打ちして、forの数を減らしてみたけど、誤差程度に速くなった?
long型配列に変換する部分をLINQのSkipとTakeを使って取り出す方法
遅い、LINQで取り出してToArray()してCopyTo()するから、forに比べると100倍も時間かかる
変換する部分をArraySegmentで取り出す
test8
test9
LINQよりは2倍速くなったけどねえ…Sliceメソッドを使ったほうが短くて済んだ
Span
SpanってのからもVectorが作れるようなので使ってみたら
Spanで取り出し
ArraySegmentよりは速くなったけど、それでもやっぱり遅い。使い方とか使いみちが違うんだろうねえ
Buffer.BlockCopyで取り出し
Spanよりさらに速くなった?けどまだ遅い、遅いけどこれで終わり
ってことでVectorのAddはいまいち
オーバーフローすることがない条件下ならどれくらいの速度なのか
long型配列にint型で収まる値が入っていて、全てを足し算してもlong型に収まるような状況、これなら変換する必要がなくlong型配列からVector<long>を作成できる
結果
test14は0.04秒台、forでは0.05秒台なので2割位速い
test14のコード
Vector<long>のAddは一度に4つの足し算ができるから4倍速いハズ?だけど、それ意外の処理が遅いのかしらねえ
番外編
int型配列をVector<int>でAdd
forより5倍以上も速い!int型なら8個同時に足し算だから、流石に速くなった。Overflowすることがない条件下ならVectorのAddがいいねえ
日記
私、平均値(合計値)はSIMD(Vector.Add)でって言ったよね!
前回のVector.MinやMaxがかなり速かったから、Addも速くなると思って試したんだけどいまいちだったねえ。Add自体は速いんだけどお膳立てとか条件が限られるとかで本来の速さが発揮できない感じなので合計値(平均値)は普通にforで計算するかなあ
参照したところ
dobon.net
ArraySegmentとBuffer.BlockCopyの使い方はこちらから
XAML
<Window xClass="_20200205_int配列の値の合計.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlnsx="http://schemas.microsoft.com/winfx/2006/xaml"
xmlnsd="http://schemas.microsoft.com/expression/blend/2008"
xmlnsmc="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlnslocal="clr-namespace:_20200205_int配列の値の合計"
mcIgnorable="d"
Title="MainWindow" Height="480" Width="600">
<Grid>
<StackPanel>
<StackPanelResources>
<Style TargetType="StackPanel">
<Setter Property="Margin" Value="2"/>
</Style>
<Style TargetType="Button">
<Setter Property="Width" Value="60"/>
</Style>
<Style TargetType="TextBlock">
<Setter Property="Margin" Value="2,0"/>
</Style>
</StackPanelResources>
<TextBlock xName="MyTextBlock" Text="text" HorizontalAlignment="Center"/>
<TextBlock xName="MyTextBlockVectorCount" Text="vectorCount" HorizontalAlignment="Center"/>
<StackPanel Orientation="Horizontal">
<Button xName="Button1" Content="test1"/>
<TextBlock xName="Tb1" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button2" Content="test2"/>
<TextBlock xName="Tb2" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button3" Content="test3"/>
<TextBlock xName="Tb3" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button4" Content="test4"/>
<TextBlock xName="Tb4" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button5" Content="test5"/>
<TextBlock xName="Tb5" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button6" Content="test6"/>
<TextBlock xName="Tb6" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button7" Content="test7"/>
<TextBlock xName="Tb7" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button8" Content="test8"/>
<TextBlock xName="Tb8" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button9" Content="test9"/>
<TextBlock xName="Tb9" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button10" Content="test10"/>
<TextBlock xName="Tb10" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button11" Content="test11"/>
<TextBlock xName="Tb11" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button12" Content="test12"/>
<TextBlock xName="Tb12" Text="time"/>
</StackPanel>
<Border Height="1" Background="Orange" UseLayoutRounding="True"/>
<StackPanel Orientation="Horizontal">
<Button xName="Button13" Content="test13"/>
<TextBlock xName="Tb13" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button14" Content="test14"/>
<TextBlock xName="Tb14" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button15" Content="test15"/>
<TextBlock xName="Tb15" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button16" Content="test16"/>
<TextBlock xName="Tb16" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button17" Content="test17"/>
<TextBlock xName="Tb17" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button18" Content="test18"/>
<TextBlock xName="Tb18" Text="time"/>
</StackPanel>
<StackPanel Orientation="Horizontal">
<Button xName="Button19" Content="test19"/>
<TextBlock xName="Tb19" Text="time"/>
</StackPanel>
</StackPanel>
</Grid>
</Window>
using System;
using System.Linq;
using System.Windows;
using System.Windows.Controls;
using System.Diagnostics;
using System.Numerics;
namespace _20200205_int配列の値の合計
{
<summary>
</summary>
public partial class MainWindow : Window
{
private int[] MyIntAry;
private long[] MyLongAry;
private const int LOOP_COUNT = 100;
private const int ELEMENT_COUNT = 1_000_000;
public MainWindow()
{
InitializeComponent();
MyTextBlock.Text = $"配列の値の合計、要素数{ELEMENT_COUNT.ToString("N0")}の合計を{LOOP_COUNT}回求める処理時間";
MyTextBlockVectorCount.Text = $"Vector<long>.Count = {Vector<long>.Count}";
MyIntAry = Enumerable.Range(1, ELEMENT_COUNT).ToArray();
MyLongAry = new long[MyIntAry.Length];
MyIntAry.CopyTo(MyLongAry, 0);
Button1.Click += (s, e) => MyExe(TestFor, Tb1);
Button2.Click += (s, e) => MyExe(TestForeach, Tb2);
Button3.Click += (s, e) => MyExe(TestLinqSum, Tb3);
Button4.Click += (s, e) => MyExe(TestVectorAddCopyToAll, Tb4);
Button5.Click += (s, e) => MyExe(TestVectorAddEach, Tb5);
Button6.Click += (s, e) => MyExe(TestVectorAddEachCount4, Tb6);
Button7.Click += (s, e) => MyExe(TestVectorAddEachCopyToLinq, Tb7);
Button8.Click += (s, e) => MyExe(TestVectorAddEachCopyToArraySegment, Tb8);
Button9.Click += (s, e) => MyExe(TestVectorAddEachCopyToArraySegmentSlice, Tb9);
Button10.Click += (s, e) => MyExe(TestVectorAddEachCopyToSpan, Tb10);
Button11.Click += (s, e) => MyExe(TestVectorAddEachCopyToSpan2, Tb11);
Button12.Click += (s, e) => MyExe(TestVectorAddEachCopyToBlockCopy, Tb12);
Button13.Click += (s, e) => MyExe(TestLongLinqSum, Tb13);
Button14.Click += (s, e) => MyExe(TestLongVectorAdd, Tb14);
Button15.Click += (s, e) => MyExe(TestLongVectorAddSpan, Tb15);
Button16.Click += (s, e) => MyExe(TestLongVectorAddReadOnlySpan, Tb16);
Button17.Click += (s, e) => MyExe(TestLongVectorAdd2, Tb17);
Button18.Click += (s, e) => MyExe(TestLongVectorAddOverflow, Tb18);
}
private long TestFor(int[] Ary)
{
long total = 0;
for (int i = 0; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
private long TestForeach(int[] Ary)
{
long total = 0;
foreach (var item in Ary)
{
total += item;
}
return total;
}
private long TestLinqSum(int[] Ary)
{
return Ary.Sum(i => (long)i);
}
private long TestVectorAddCopyToAll(int[] Ary)
{
var longAry = new long[Ary.Length];
Ary.CopyTo(longAry, 0);
int simdLength = Vector<long>.Count;
var v = new Vector<long>(longAry);
int lastIndex = Ary.Length - (Ary.Length % simdLength);
for (int i = simdLength; i < lastIndex; i += simdLength)
{
v = System.Numerics.Vector.Add(v, new Vector<long>(longAry, i));
}
long total = 0;
for (int i = 0; i < simdLength; i++)
{
total += v[i];
}
for (int i = lastIndex; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
private long TestVectorAddEach(int[] Ary)
{
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
var longAry = new long[simdLength];
for (int i = 0; i < simdLength; i++)
{
longAry[i] = Ary[i];
}
var v = new Vector<long>(longAry);
for (int j = simdLength; j < lastIndex; j += simdLength)
{
for (int i = 0; i < simdLength; i++)
{
longAry[i] = Ary[j + i];
}
v = System.Numerics.Vector.Add(v, new Vector<long>(longAry));
}
long total = 0;
for (int i = 0; i < simdLength; i++)
{
total += v[i];
}
for (int i = lastIndex; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
private long TestVectorAddEachCount4(int[] Ary)
{
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % 4);
var longAry = new long[4];
longAry[0] = Ary[0];
longAry[1] = Ary[1];
longAry[2] = Ary[2];
longAry[3] = Ary[3];
var v = new Vector<long>(longAry);
for (int i = 4; i < lastIndex; i += simdLength)
{
longAry[0] = Ary[i];
longAry[1] = Ary[i + 1];
longAry[2] = Ary[i + 2];
longAry[3] = Ary[i + 3];
v = System.Numerics.Vector.Add(v, new Vector<long>(longAry));
}
long total = 0;
for (int i = 0; i < 4; i++)
{
total += v[i];
}
for (int i = lastIndex; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
private long TestVectorAddEachCopyToLinq(int[] Ary)
{
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
long[] longAry = new long[simdLength];
Ary.Take(simdLength).ToArray().CopyTo(longAry, 0);
var v = new Vector<long>(longAry);
for (int i = simdLength; i < lastIndex; i += simdLength)
{
Ary.Skip(i).Take(simdLength).ToArray().CopyTo(longAry, 0);
v = System.Numerics.Vector.Add(v, new Vector<long>(longAry, 0));
}
long total = 0;
for (int i = 0; i < simdLength; i++)
{
total += v[i];
}
for (int i = lastIndex; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
private long TestVectorAddEachCopyToArraySegment(int[] Ary)
{
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
long[] longAry = new long[simdLength];
new ArraySegment<int>(Ary, 0, simdLength).ToArray().CopyTo(longAry, 0);
var v = new Vector<long>(longAry);
for (int i = simdLength; i < lastIndex; i += simdLength)
{
new ArraySegment<int>(Ary, i, simdLength).ToArray().CopyTo(longAry, 0);
v = System.Numerics.Vector.Add(v, new Vector<long>(longAry, 0));
}
long total = 0;
for (int i = 0; i < simdLength; i++)
{
total += v[i];
}
for (int i = lastIndex; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
private long TestVectorAddEachCopyToArraySegmentSlice(int[] Ary)
{
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
long[] longAry = new long[simdLength];
var segment = new ArraySegment<int>(Ary);
segment.Slice(0, simdLength).ToArray().CopyTo(longAry, 0);
var v = new Vector<long>(longAry);
for (int i = simdLength; i < lastIndex; i += simdLength)
{
segment.Slice(i, simdLength).ToArray().CopyTo(longAry, 0);
v = System.Numerics.Vector.Add(v, new Vector<long>(longAry, 0));
}
long total = 0;
for (int i = 0; i < simdLength; i++)
{
total += v[i];
}
for (int i = lastIndex; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
private long TestVectorAddEachCopyToSpan(int[] Ary)
{
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
long[] longAry = new long[simdLength];
new Span<int>(Ary).Slice(0, simdLength).ToArray().CopyTo(longAry, 0);
var v = new Vector<long>(longAry);
for (int i = simdLength; i < lastIndex; i += simdLength)
{
new Span<int>(Ary).Slice(i, simdLength).ToArray().CopyTo(longAry, 0);
v = System.Numerics.Vector.Add(v, new Vector<long>(longAry, 0));
}
long total = 0;
for (int i = 0; i < simdLength; i++) total += v[i];
for (int i = lastIndex; i < Ary.Length; i++) total += Ary[i];
return total;
}
private long TestVectorAddEachCopyToSpan2(int[] Ary)
{
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
long[] longAry = new long[simdLength];
var sp = new Span<int>(Ary);
sp.Slice(0, simdLength).ToArray().CopyTo(longAry, 0);
var v = new Vector<long>(longAry);
for (int i = simdLength; i < lastIndex; i += simdLength)
{
sp.Slice(i, simdLength).ToArray().CopyTo(longAry, 0);
v = System.Numerics.Vector.Add(v, new Vector<long>(longAry, 0));
}
long total = 0;
for (int i = 0; i < simdLength; i++) total += v[i];
for (int i = lastIndex; i < Ary.Length; i++) total += Ary[i];
return total;
}
private long TestVectorAddEachCopyToBlockCopy(int[] Ary)
{
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
long[] longAry = new long[simdLength];
int[] iAry = new int[simdLength];
int typeSize = System.Runtime.InteropServices.Marshal.SizeOf(Ary.GetType().GetElementType());
Buffer.BlockCopy(Ary, 0, iAry, 0, simdLength * typeSize);
iAry.CopyTo(longAry, 0);
var v = new Vector<long>(longAry);
int blockSize = simdLength * typeSize;
for (int i = simdLength; i < lastIndex; i += simdLength)
{
Buffer.BlockCopy(Ary, i * typeSize, iAry, 0, blockSize);
iAry.CopyTo(longAry, 0);
v = System.Numerics.Vector.Add(v, new Vector<long>(longAry, 0));
}
long total = 0;
for (int i = 0; i < simdLength; i++) total += v[i];
for (int i = lastIndex; i < Ary.Length; i++) total += Ary[i];
return total;
}
#region long型配列
private long TestLongLinqSum(long[] Ary)
{
return Ary.Sum();
}
private long TestLongVectorAdd(long[] Ary)
{
var v = new Vector<long>(Ary);
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
for (int i = simdLength; i < lastIndex; i += simdLength)
{
v = System.Numerics.Vector.Add(v, new Vector<long>(Ary, i));
}
long total = 0;
for (int i = 0; i < simdLength; i++)
{
total += v[i];
}
for (int i = lastIndex; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
private long TestLongVectorAddSpan(long[] Ary)
{
var sp = new Span<long>(MyLongAry);
var v = new Vector<long>(sp);
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
for (int i = simdLength; i < lastIndex; i += simdLength)
{
v = System.Numerics.Vector.Add(v, new Vector<long>(sp.Slice(i, simdLength)));
}
long total = 0;
for (int i = 0; i < simdLength; i++)
{
total += v[i];
}
for (int i = lastIndex; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
private long TestLongVectorAddReadOnlySpan(long[] Ary)
{
var sp = new ReadOnlySpan<long>(MyLongAry);
var v = new Vector<long>(sp);
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
for (int i = simdLength; i < lastIndex; i += simdLength)
{
v = System.Numerics.Vector.Add(v, new Vector<long>(sp.Slice(i, simdLength)));
}
long total = 0;
for (int i = 0; i < simdLength; i++)
{
total += v[i];
}
for (int i = lastIndex; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
private long TestLongVectorAdd2(long[] Ary)
{
var v = new Vector<long>(Ary);
int simdLength = Vector<long>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
for (int i = simdLength; i < lastIndex; i += simdLength)
{
v = System.Numerics.Vector.Add(v, new Vector<long>(Ary[i..(i + simdLength)]));
}
long total = 0;
for (int i = 0; i < simdLength; i++)
{
total += v[i];
}
for (int i = lastIndex; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
#endregion
private long TestLongVectorAddOverflow(int[] Ary)
{
var v = new Vector<int>(Ary);
int simdLength = Vector<int>.Count;
int lastIndex = Ary.Length - (Ary.Length % simdLength);
for (int i = simdLength; i < lastIndex; i += simdLength)
{
v = System.Numerics.Vector.Add(v, new Vector<int>(Ary, i));
}
long total = 0;
for (int i = 0; i < simdLength; i++)
{
total += v[i];
}
for (int i = lastIndex; i < Ary.Length; i++)
{
total += Ary[i];
}
return total;
}
private void MyExe(Func<int[], long> func, TextBlock tb)
{
long total = 0;
var sw = new Stopwatch();
sw.Start();
for (int i = 0; i < LOOP_COUNT; i++)
{
total = func(MyIntAry);
}
sw.Stop();
tb.Text = $"処理時間:{sw.Elapsed.TotalSeconds.ToString("00.000")}秒 合計値:{total} {System.Reflection.RuntimeReflectionExtensions.GetMethodInfo(func).Name}";
}
private void MyExe(Func<long[], long> func, TextBlock tb)
{
long total = 0;
var sw = new Stopwatch();
sw.Start();
for (int i = 0; i < LOOP_COUNT; i++)
{
total = func(MyLongAry);
}
sw.Stop();
tb.Text = $"処理時間:{sw.Elapsed.TotalSeconds.ToString("00.000")}秒 合計値:{total} {System.Reflection.RuntimeReflectionExtensions.GetMethodInfo(func).Name}";
}
}
}
.NET Frameworkだとそのままでは動かないかも、nugetでSystem.Numerics.Vectorsを追加して参照に追加する必要がある
関連記事
次の次は5日後、速くなった
gogowaten.hatenablog.com
次回は2日後
10日前
gogowaten.hatenablog.com
gogowaten.hatenablog.com