日録: 2016

2016年10月14日金曜日

[rebase] windows10のupdateにてMinGWが動かなくなった場合の対処

rebaseを使い、メモリのかちあいを無くした

./rebase -b 0x30000000 /cygdrive/c/MinGW/msys/1.0/bin/msys-1.0.dll

2016年9月21日水曜日

[Excel] 括弧によるマイナス

Excelでは(1)とセルに入力すると-1として判断される。
これはアメリカの経理において負数を()を使って記述するためである。

2016年9月7日水曜日

[ruby] Open3.popen3

標準入出力管理可能な子プロセス作成処理
ただし、バッファに難有

$instructions.each{|insn|
printf("--- TEST: %s", insn)
Tempfile.open(['xxx', '.s']){|fp|
fp.puts("\torg\t100h")
fp.puts("\t" + insn)
fp.puts("\tend")
fp.flush

# system($assembler + " " + fp.path)
Open3.capture3($assembler, fp.path)
hexname = fp.path.sub(%r{\.s$}, ".hex")
File.open(hexname){|hexfile|
$results << TestResult.new(insn.chop, hexfile.read)
}
File.unlink(hexname)
}
}

[ruby] JSON

JSON用クラス
文字列を渡してJSONオブジェクトを作ることができる。

print JSON.load("[#{$results.map{|e| e.to_json}.join(",")}]").to_s

[ruby] Tempfile

テンポラリファイルを作成するためのクラス
open()でテンポラリファイルを作成する。配列でプレフィックス、サフィックスを指定することができる。
ブロックを渡した場合はテンポラリファイルはそのブロック内で有効である。

作られたオブジェクトはpathでファイル名を取り出すことができる。
ブロックから外れた時にはテンポラリファイルは自動で削除される。

$instructions.each{|insn|
printf("--- TEST: %s", insn)
Tempfile.open(['xxx', '.s']){|fp|
fp.puts("\torg\t100h")
fp.puts("\t" + insn)
fp.puts("\tend")
fp.flush

# system($assembler + " " + fp.path)
Open3.capture3($assembler, fp.path)
hexname = fp.path.sub(%r{\.s$}, ".hex")
File.open(hexname){|hexfile|
$results << TestResult.new(insn.chop, hexfile.read)
}
File.unlink(hexname)
}
}

[ruby] Array.flatten

配列の平坦化を行う。
変更を反映するflatten!もある。

[[1,2],3].flatten
→[1,2,3]

[ruby] 文字列フォーマッタ

rubyにはsprintfもあるが、特に必要ない。
%を使うことで同様のことができる。

"\t%d\t%s\n"%[lineno, line]

[ruby] Array.product

配列の直積を作成するメソッド

[a,b].product([1,2])
→ [[a,1],[a,2],[b,1],[b,2]]

2016年7月13日水曜日

[ruby] クラスメソッド

ruby のクラスメソッドはself.をつけて以下のように定義する。

class R
def self.foo
end
end

[Windows10] 互換モード

[オプション]から互換モードを指定することで、前のバージョンのInternet Explorerのプログラムを動かすことが可能。

2016年7月4日月曜日

2016年6月17日金曜日

[ruby] 文字列の末尾削除

文字列の末尾の文字を削除するにはchop もしくは chop!を使用する。
最後が改行で2バイトになる場合、2バイトとも削除する。

2016年5月31日火曜日

[VIsual Basic] ComboBoxのDataSource

DataSourceにオブジェクトの配列を指定する場合、そのオブジェクトにはDisplayName, ValueNameの属性を持つものでなければならない。
無い場合はDataSourceを指定する時にDisplayName, ValueNameが取れないので例外が出る。

2016年5月25日水曜日

2016年5月19日木曜日

[DataGridView]　生成要素

DataGridViewの属性GenerateMemberは論理値である。
これをFalseにすると部品は作るが、外部からアクセスできないようになる。

2016年4月20日水曜日

[git] 日本語表示の乱れ対策

以下のconfig指定で乱れ解消

git config --local core.quotepath false

2016年3月14日月曜日

[Git] git/hg/svn比較

リポジトリの複製 git clone svn checkout hg clone
コミット git commit svn commit hg commit
コミットの詳細確認 git show svn cat
状態確認 git status svn status
差分確認 git diff svn diff hg diff
ログ確認 git log svn log hg log
追加 git add svn add hg add
移動 git mv svn mv hg mv
削除 git rm svn rm hg rm
hg remove
変更取消 git checkout svn revert hg update
git reset
ブランチの作成 git branch svn copy hg branch
ブランチの切替 git checkout svn switch hg update
マージ git merge svn merge hg merge
タグの作成 git tag svn copy
更新 git pull svn update hg pull
git fetch
リモートへの反映 git push svn commit hg push
無視ファイルリスト .gitignore .svnsignature

* リポジトリとはファイル/履歴の記録
gitの場合は、リモートリポジトリ、ローカルリポジトリがある。
そのため、commit, pushのようにそれぞれのリポジトリ用の処理がある。
* hgのBranch
hg branchでブランチを作成する。これはリモートの状態変更が含まれているもので、ローカル、リモートの二つの枝ができることになる。
hg mergeを行うと複数の変更が一つにまとめられる
hg rebaseを行うと、ローカルブランチの変更がリモートブランチの後に加えられ一本化する

[Visual Basic] .NETの属性

ThreadStatic

スレッド内でのみ共通なデータを作成する。

[Visual Basic] Excel処理の終了方法

Dim xlApp As Excel.Application
Dim xlWkb As Excel.WorkBook
Dim xlWsh As Excel.WorkSheet
Set xlApp = CreateObject("Excel.Application")
Set xlWkb = xlApp.WorkBooks.Open(strfullpass)
Set xlWsh = xlWkb.WorkSheets(strsheet)
xlWsh.Cells(introw, intcol) = strsubs
xlWkb.Save
xlWkb.Close
xlApp.Quit

2016年3月8日火曜日

[VBNET] DateTimePIcker

Checked属性

CheckBoxを付けた場合、この属性によりCheckされているかどうか操作可能

Binding

DateTimePickerのCheckBoxを絡めた相互自動変換は行わないので、変換処理を自作する必要が有る。

DateTimePickerFormatterはDateTimePicker→String変換処理
DateTimePickerParserはString→DateTimePicker変換処理

Private Sub SetupDateTimePicker(ByRef dtp As DateTimePicker, ByRef bs As BindingSource, ByVal key As String, ByVal bReadOnly As Boolean)
Dim binding As New Binding("Text", bs, key, True, DataSourceUpdateMode.OnPropertyChanged)
AddHandler binding.Format, AddressOf DateTimePickerFormatter
AddHandler binding.Parse, AddressOf DateTimePickerParser
dtp.DataBindings.Add(binding)
If bReadOnly Then
fncCommon.makeDummyTextBoxFromControl(dtp)
End If
End Sub

[VBNET] ComboBox

ComboBoxのSelectedIndexを-1にすることでコンボボックスの情報を未設定にすることができる。

2016年3月1日火曜日

[Visual Basic] 条件演算

AND/OR条件はAnd/Or演算が使用可能
ただし、左右両辺の演算を行ってしまう。
C/C++同様に左辺から順に演算するのであればAndAlso, OrElseを使用する
C/C++の?:に似たIIFがあるが、これは関数扱いなので、AndAlso, OrElseのような処理は行わない。

2016年2月29日月曜日

[.NET] DataGridViewのイベント

DataRowChanged 行が変更し終わった後に発行
RowValidating Validate処理中に発行。始める前にはこない
RowValidated Validate処理後に発行。

2016年2月18日木曜日

[c,c++] cのコードからc++のコードを呼び出す

extern "C"を使う

extern "C" char *foo()
{
return "HW";
}

この関数はCから呼び出すことが可能になる。

#include
void main()
{
printf("%s\n", foo());
}

ただし、ライブラリ関係が面倒くさい

2016年2月16日火曜日

[.NET] DataTableのソート

DataViewを以下のように初期化する。

Dim dv as DataView = new DataView(table, Nothing, "FL_NAME", CurrentRows)

FL__NAMEが文字列の場合、正しくソートできない場合が有る。
これは、データの比較処理としてtableに設定されている比較関数を使用している為である。

2016年2月10日水曜日

[application verifier] 使用方法

applicatoin verifierはWindowsで実行時のリーク等を確認するためのツール。

以下のように実行すると、ダイアログが出現するので、計測するモジュールを指定する。
後は、測定するモジュールを実行すればデータが取得される。

% appverif

ダイアログのViewでXml形式でのデータを見ることができるが、実際ログは.dat形式にファイルである。

.dat形式のファイルを.xmlに変換するには以下のようにする。

% appverif -logtoxml

2016年2月4日木曜日

[c++] string

整数を文字列に変換する

以前はstringstreamを使っていたが、std::to_stringが使用可能になった

ただし、"%d"フォーマット固定であるので、16進数変換はできない。

2016年1月18日月曜日

[yacc] %left, %right, %prec

中間演算子等の再帰方向を指定すると共に、優先度付も行う。

%left ID_OPERATOR_LOGICAL_OR
%left ID_OPERATOR_LOGICAL_AND
%left '|'
%left '^'
%left '&'
%left ID_OPERATOR_COMPARE_EQ ID_OPERATOR_COMPARE_NE
%left '>' '<' ID_OPERATOR_COMPARE_LE ID_OPERATOR_COMPARE_GE
%left ID_OPERATOR_LEFT_SHIFT ID_OPERATOR_RIGHT_SHIFT
%left '+' '-'
%left '*' '/' '%'
%left UMINUS

上記は中間オペレータと前置オペレータの優先度を表現している。一番下が優先度が一番高い。

また、中間演算子は全て左再帰になる。

primary_expression : identifier { $$ = new VEAccess(VECode::opAccess, $1); }
| '&' identifier { $$ = new VEAccess(VECode::opAddress, $2); }
| '-' expression %prec UMINUS { $$ = new VEUnaryExpression(VECode::opUnaryMinus, $2); }
| constant
| ID_SIZEOF '(' type_declaration ')' { $$ = new VESizeOf($3); }
| ID_SIZEOF '(' identifier ')' { $$ = new VESizeOf(); }
| '{' expressions '}' { $$ = $2; }
| '(' expression ')' { $$ = $2; }

;

また、単項演算子の'-'に関しては%precにより意味付けをする。

[ruby] 標準出力、標準エラー出力

標準出力、標準エラー出力への出力は以下の通り

STDOUT.puts("A")
STDERR.puts("A")

2016年1月14日木曜日

[mercurial] 入門mercurialからのメモ

概要

git同様、分散型リポジトリ

操作

初期化

% hg init
% hg init work

登録

% hg add index.html

コミット

% hg commit

履歴参照

% hg log

取得

% hg update 3

まとめて取り出す

% hg archive -r 3 /tmp/mywork.%h

不要ファイルの削除

登録の必要がないファイルを削除する。

% hg remove

変更の取り消し

hg commit以外の取り消しはrevirtで行う。revirt自体の取り消しは行えない。

% hg revirt

hg commitの取り消しはrollbackで行う

ファイル名付け直し

copyで名前変更

% hg copy

全体の複製

%hg clone

外部との連携

pullは自リポジトリへの反映、pushは外部に反映

%hg pull
%hg push

マージ

複数のバージョンをマージ。

% hg merge

マージの状況を表示

% hg resolve -l

テキストベースでの成果送付

% hg export -o

バイナリベースでの成果送付

% hg bundle
% hg unbundle

[C] C言語BNF

translation_unit : external_decl
| translation_unit external_decl
;
external_decl : function_definition
| decl
;
function_definition : decl_specs declarator decl_list compound_stat
| declarator decl_list compound_stat
| decl_specs declarator compound_stat
| declarator compound_stat
;
decl : decl_specs init_declarator_list ';'
| decl_specs ';'
;
decl_list : decl
| decl_list decl
;
decl_specs : storage_class_spec decl_specs
| storage_class_spec
| type_spec decl_specs
| type_spec
| type_qualifier decl_specs
| type_qualifier
;
storage_class_spec : 'auto' | 'register' | 'static' | 'extern' | 'typedef'
;
type_spec : 'void' | 'char' | 'short' | 'int' | 'long' | 'float'
| 'double' | 'signed' | 'unsigned'
| struct_or_union_spec
| enum_spec
| typedef_name
;
type_qualifier : 'const' | 'volatile'
;
struct_or_union_spec : struct_or_union id '{' struct_decl_list '}'
| struct_or_union '{' struct_decl_list '}'
| struct_or_union id
;
struct_or_union : 'struct' | 'union'
;
struct_decl_list : struct_decl
| struct_decl_list struct_decl
;
init_declarator_list : init_declarator
| init_declarator_list ',' init_declarator
;
init_declarator : declarator
| declarator '=' initializer
;
struct_decl : spec_qualifier_list struct_declarator_list ';'
;
spec_qualifier_list : type_spec spec_qualifier_list
| type_spec
| type_qualifier spec_qualifier_list
| type_qualifier
;
struct_declarator_list : struct_declarator
| struct_declarator_list ',' struct_declarator
;
struct_declarator : declarator
| declarator ':' const_exp
| ':' const_exp
;
enum_spec : 'enum' id '{' enumerator_list '}'
| 'enum' '{' enumerator_list '}'
| 'enum' id
;
enumerator_list : enumerator
| enumerator_list ',' enumerator
;
enumerator : id
| id '=' const_exp
;
declarator : pointer direct_declarator
| direct_declarator
;
direct_declarator : id
| '(' declarator ')'
| direct_declarator '[' const_exp ']'
| direct_declarator '[' ']'
| direct_declarator '(' param_type_list ')'
| direct_declarator '(' id_list ')'
| direct_declarator '(' ')'
;
pointer : '*' type_qualifier_list
| '*'
| '*' type_qualifier_list pointer
| '*' pointer
;
type_qualifier_list : type_qualifier
| type_qualifier_list type_qualifier
;
param_type_list : param_list
| param_list ',' '...'
;
param_list : param_decl
| param_list ',' param_decl
;
param_decl : decl_specs declarator
| decl_specs abstract_declarator
| decl_specs
;
id_list : id
| id_list ',' id
;
initializer : assignment_exp
| '{' initializer_list '}'
| '{' initializer_list ',' '}'
;
initializer_list : initializer
| initializer_list ',' initializer
;
type_name : spec_qualifier_list abstract_declarator
| spec_qualifier_list
;
abstract_declarator : pointer
| pointer direct_abstract_declarator
| direct_abstract_declarator
;
direct_abstract_declarator: '(' abstract_declarator ')'
| direct_abstract_declarator '[' const_exp ']'
| '[' const_exp ']'
| direct_abstract_declarator '[' ']'
| '[' ']'
| direct_abstract_declarator '(' param_type_list ')'
| '(' param_type_list ')'
| direct_abstract_declarator '(' ')'
| '(' ')'
;
typedef_name : id
;
stat : labeled_stat
| exp_stat
| compound_stat
| selection_stat
| iteration_stat
| jump_stat
;
labeled_stat : id ':' stat
| 'case' const_exp ':' stat
| 'default' ':' stat
;
exp_stat : exp ';'
| ';'
;
compound_stat : '{' decl_list stat_list '}'
| '{' stat_list '}'
| '{' decl_list '}'
| '{' '}'
;
stat_list : stat
| stat_list stat
;
selection_stat : 'if' '(' exp ')' stat
| 'if' '(' exp ')' stat 'else' stat
| 'switch' '(' exp ')' stat
;
iteration_stat : 'while' '(' exp ')' stat
| 'do' stat 'while' '(' exp ')' ';'
| 'for' '(' exp ';' exp ';' exp ')' stat
| 'for' '(' exp ';' exp ';' ')' stat
| 'for' '(' exp ';' ';' exp ')' stat
| 'for' '(' exp ';' ';' ')' stat
| 'for' '(' ';' exp ';' exp ')' stat
| 'for' '(' ';' exp ';' ')' stat
| 'for' '(' ';' ';' exp ')' stat
| 'for' '(' ';' ';' ')' stat
;
jump_stat : 'goto' id ';'
| 'continue' ';'
| 'break' ';'
| 'return' exp ';'
| 'return' ';'
;
exp : assignment_exp
| exp ',' assignment_exp
;
assignment_exp : conditional_exp
| unary_exp assignment_operator assignment_exp
;
assignment_operator : '=' | '*=' | '/=' | '%=' | '+=' | '-=' | '<<='
| '>>=' | '&=' | '^=' | '|='
;
conditional_exp : logical_or_exp
| logical_or_exp '?' exp ':' conditional_exp
;
const_exp : conditional_exp
;
logical_or_exp : logical_and_exp
| logical_or_exp '||' logical_and_exp
;
logical_and_exp : inclusive_or_exp
| logical_and_exp '&&' inclusive_or_exp
;
inclusive_or_exp : exclusive_or_exp
| inclusive_or_exp '|' exclusive_or_exp
;
exclusive_or_exp : and_exp
| exclusive_or_exp '^' and_exp
;
and_exp : equality_exp
| and_exp '&' equality_exp
;
equality_exp : relational_exp
| equality_exp '==' relational_exp
| equality_exp '!=' relational_exp
;
relational_exp : shift_expression
| relational_exp '<' shift_expression
| relational_exp '>' shift_expression
| relational_exp '<=' shift_expression
| relational_exp '>=' shift_expression
;
shift_expression : additive_exp
| shift_expression '<<' additive_exp
| shift_expression '>>' additive_exp
;
additive_exp : mult_exp
| additive_exp '+' mult_exp
| additive_exp '-' mult_exp
;
mult_exp : cast_exp
| mult_exp '*' cast_exp
| mult_exp '/' cast_exp
| mult_exp '%' cast_exp
;
cast_exp : unary_exp
| '(' type_name ')' cast_exp
;
unary_exp : postfix_exp
| '++' unary_exp
| '--' unary_exp
| unary_operator cast_exp
| 'sizeof' unary_exp
| 'sizeof' '(' type_name ')'
;
unary_operator : '&' | '*' | '+' | '-' | '~' | '!'
;
postfix_exp : primary_exp
| postfix_exp '[' exp ']'
| postfix_exp '(' argument_exp_list ')'
| postfix_exp '(' ')'
| postfix_exp '.' id
| postfix_exp '->' id
| postfix_exp '++'
| postfix_exp '--'
;
primary_exp : id
| const
| string
| '(' exp ')'
;
argument_exp_list : assignment_exp
| argument_exp_list ',' assignment_exp
;
const : int_const
| char_const
| float_const
| enumeration_const
;

2016年1月12日火曜日

[Excel, VBA] セレクションの要素操作

セレクトした領域の背景色に応じて値を設定

Selectionを使って選択領域にアクセスできる
Selectionの要素はfor each文により個別アクセス可能
背景食はInterior.Colorでアクセス
&Hffffffは白
.Valueによってセル値のアクセスが可能

Sub テスト欄に〇を付ける()
Dim c As Range

For Each c In Selection
If c.Interior.Color = &HFFFFFF Then
c.Value = "〇"
End If
Next c
End Sub

[ruby, win32ole] rubyからExcel起動

既存のExcelファイルを開く

1. requireでwin32oleを取り込む
2. WIN32OLEをExcel.Applicationで開く。Excel対応
3. Visible = trueはExcelを表示するかどうか
4. Workbooks.Open()で既存のExcelファイルを開く
5. Workbooksの下にはシートが配列形式で格納されている。
6. Workdsheets()内のセルをCell()で取り出す。値はValueで取得
7. Excel.Quit()でExcel終了

require 'win32ole'
excel = WIN32OLE.new('Excel.Application')
excel.Visible = true
workbook = excel.Workbooks.Open('C:\Users\hisa\Canon\V4Drv\doc\design\GPDコンフリクトエンジン評価計画書・評価仕様書.xlsx')
sheet = workbook.Worksheets[3]
title = 6
id = sheet.Cells(title, 2).Value
category = sheet.Cells(title, 3).Value
normal = sheet.Cells(title, 4).Value
detail = sheet.Cells(title, 5).Value
file = sheet.Cells(title, 6).Value
expect = sheet.Cells(title, 7).Value
result = sheet.Cells(title, 8).Value
resultRT = sheet.Cells(title, 9).Value
excel.Quit

2016年1月8日金曜日

[ruby] lambda式

lambda式

rubyにもlambda式がある。

lambda式の評価はcallメソッドを使用する

def getInsn(json, predicate)
json.find{|e| e != nil && predicate.call(e)}
end

定義はlambda+ブロックで行う

def noError(json)
(insn = getInsn(json, lambda{|e| e["Name"] == "GetParseErrors" || e["Name"] == "GetParseError"})) != nil &&
(!insn.include?("Error") || insn["Error"] == [])
end

2016年1月7日木曜日

[gcc, spu] spu版gcc最適化問題(6)

2004年作成

スタックの使用での問題

(a) 使用するレジスタ数が多くなってくるとスタックに一旦ストアしますが、その読み込んだ後のレジスタ効率が悪化します。

lqd $21,32($sp)
absdb $28,$108,$78 lqd $2,32($sp)
lqd $15,160($sp)
shufb $107,$17,$105,$101
absdb $105,$58,$76 stqd $94,1648($sp)
absdb $17,$55,$81 stqd $102,2544($sp)
absdb $81,$50,$81 shufb $94,$7,$35,$101
absdb $40,$21,$107 shufb $102,$31,$12,$101
absdb $33,$2,$11 stqd $162,2560($sp)

$21も$2も同じメモリから読み込んでいるので、別のレジスタに割り当てる必要はありません。
以下のように$21に統合可能です。

lqd $21,32($sp)
absdb $28,$108,$78 lqd $15,160($sp)
shufb $107,$17,$105,$101
absdb $105,$58,$76 stqd $94,1648($sp)
absdb $17,$55,$81 stqd $102,2544($sp)
absdb $81,$50,$81 shufb $94,$7,$35,$101
absdb $40,$21,$107 shufb $102,$31,$12,$101
absdb $33,$21,$11 stqd $162,2560($sp)

(b) ストア-ロード処理ではなくコピーで済ませることができる命令がある。

absdb $124,$71,$116 stqd $6,1856($sp)
absdb $59,$36,$82 stqd $72,1536($sp)
shufb $72,$61,$75,$84
lqd $12,240($sp)
lqd $20,32($sp)
lqd $21,1856($sp) ← (*)
absdb $87,$44,$72 stqd $74,592($sp)
nop shufb $74,$31,$102,$111

(*)のロード命令は以下のようにコピー命令に置換できます。

absdb $124,$71,$116 stqd $6,1856($sp)
absdb $59,$36,$82 stqd $72,1536($sp)
ori $21,$72,0 shufb $72,$61,$75,$84
lqd $12,240($sp)
lqd $20,32($sp)
absdb $87,$44,$72 stqd $74,592($sp)
nop shufb $74,$31,$102,$111

il $43, 0 lqd $56,0($88)
:
stqd $43,112($sp) ← 意味なし
:
ah $119,$120,$12 stqd $63,112($sp)
andi $57,$115,15 lqd $18,112($sp)

(d)一つのレジスタで済ませられるものがある。

stqd $32,4816($sp)
nop lqd $27,4816($sp)
clgth $20,$24,$25 lqd $23,4816($sp)
:
shlqbyi $22,$27,2
shufb $14,$15,$79,$16
selb $17,$23,$22,$20 shlqbyi $19,$26,2

$27,$23共に$32に置き換えることが可能です。

clgth $20,$24,$25
:
shlqbyi $22,$32,2
shufb $14,$15,$79,$16
selb $17,$32,$22,$20 shlqbyi $19,$26,2

[ruby] 配列

sort

配列のsortメソッドにて行う。

ブロックを引数にすることも可能。

-

中間演算子。

左辺の配列要素から右辺の配列要素を削除する

検索

要素の存在を確認する場合はinclude?を使用。

合致する情報の検索を行う場合はfindを使用する。

findはブロックを引数にとることが可能

selectは渡されたブロックの条件が成立した要素の配列を返す

map

lispのmapと同じで配列の各要素にブロックの処理を行い、その結果の配列を返す。

collectも同じ

[ruby] json

準備

require 'json'

構文解析

文字列を引数にしてJSONの構文解析を行う

JSON.parse(open(file, &:read))

返された情報は配列はrubyの配列、オブジェクトは連想配列に変換される。

[ruby] ファイル読み込み

ファイルを変数に読み込む

fileで示されたファイルの内容全てを返す。
この方式だと処理後勝手にファイルをクローズする

open(file, &:read)

2016年1月6日水曜日

[gcc, spu] spu版gcc最適化問題(5)

2004年作成

問題を回避する手段

今まで提出した問題の回避方法について記述します。
番号は[SPUコンパイラ最適化項目]と同じものです。

回避可能な問題に関してのみ記述します。

変数のstore処理の無駄

vector 以外のデータ構造のアクセスではどうしてもメモリからデータを読み込み、ch 系の命令でシャッフル用の情報を作り、シャッフルで値を設定しストアします。
vector型を代わりに使用すれば、回避できます。
既にfor文のindex変数等はやっています。

ループ最適化

vector型を代わりに使用すれば、回避できます。

除算

除数が定数の場合、最適化しないという問題ですが、除算を使用せず、シフト命令等で処理すれば回避できます。

整数演算の順序

複数の加算、乗算を行う場合、乗算を前から順に行うと前の演算の結果を待つため、処理が遅れる場合があります。
処理の順序を括弧で指定することで回避することが可能です。

インライン展開時のスタックポインタ処理

インライン関数ではなくマクロで処理する。

変数のstore処理の無駄

vector型を代わりに使用すれば、回避できます。

シャフル用のベクタに関して

const vec_uchar16 vec = (vec_uchar16)
{
0x0c, 0x0d, 0x0e, 0x0f,
0x0c, 0x0d, 0x0e, 0x0f,
0x0c, 0x0d, 0x0e, 0x0f,
0x0c, 0x0d, 0x0e, 0x0f
}

上記のようなベクタはメモリから読み込まないで、コード上で作成されます。
よって、定義していると無駄なデータ領域になります。

コードでは

ilhu $109, 0x0c0d 上位16ビット指定
...
iohl $109, 0x0e0f 下位16ビットをor

で作成されます。

vec_uchar16 l = { 0x00, 0x01, 0x02, 0x03, 0x08, 0x09, 0x0a, 0x0b,
0x10, 0x11, 0x12, 0x13, 0x18, 0x19, 0x1a, 0x1b };
vec_uchar16 r = { 0x04, 0x05, 0x06, 0x07, 0x0c, 0x0d, 0x0e, 0x0f,
0x14, 0x15, 0x16, 0x17, 0x1c, 0x1d, 0x1e, 0x1f };

のような計算で他のベクタが求まるような情報がある場合、

il $4, 0x0404
add $4, $5, $4

にしても命令は増えますが、サイクル数は同じです。さらに定数部の 16 バイトが削れます。
しかし、現在は定数計算をしてしまうようで、結局同じようなコードになってしまいます。

シャッフル用のベクタテーブルは結構多いようなので、一つのモジュールでまとめた方が良いと思います。

[gcc, spu] spu版gcc最適化問題(４)

2004年作成

ループ最適化

L11で始まるブロックはループの中にあるブロックです。

以下の情報をループの外で設定することで高速化が可能です。
1. 161行目にあるdestのようなシンボルのアドレス
2. 140,142,148,150 にある chd 等で取り出す、レジスタのバイト位置情報。(sp はこのブロック内で更新されません)

132 .L18:
133 brz $23,.L6
134 il $30,0
135 hbra .L17,.L11
136 il $29,0
137 nop $127
138 .L11:
139 ai $28,$sp,32
140 lqd $24,64($sp)
141 ai $4,$29,32
142 chd $25,0($sp)
143 a $7,$29,$28
144 chd $20,2($sp)
145 ai $10,$29,14
146 lqx $22,$4,$sp
147 ai $9,$7,6
148 chd $26,4($sp)
149 ai $8,$7,4
150 chd $28,6($sp)
151 ai $16,$9,14
152 lqd $27,0($9)
153 ai $4,$7,2
154 ai $18,$8,14
155 ai $17,$4,14
156 rotqby $13,$22,$10
157 a $11,$30,$30
158 ori $21,$17,0
159 ori $22,$18,0
160 rotqby $14,$27,$16
161 ila $23,dest
162 lqx $12,$11,$31
:
222 .L17:
223 brz $16,.L11
224 ai $sp,$sp,352

[gcc, spu] spu版gcc最適化問題(３)

2004年作成

インライン展開時のスタックポインタ処理

-O2でコンパイルするとinline定義されたものはinline展開します。
展開するにもかかわらずスタックポインタ操作の処理がそのまま残っています。
inline の場合は元になる関数の部分で全てを済ませればいいのですから、全て無駄なコードになっています。

1 _start:
2 ila $7,66051
3 stqd $17,-48($sp)
4 lqa $4,Test1In+16
:
39 stqd $lr,16($sp) # リターンレジスタも積む必要無し
38 a $7,$5,$4
39 stqd $sp,-112($sp) # スタックポインタのセーブ処理
40 sfh $6,$11,$6
41 lnop
42 ah $2,$2,$14
43 lqd $15,0($7)
44 ila $4,q4x4+16
45 lnop
46 sfi $8,$8,-15
47 shufb $10,$2,$6,$10
48 ai $sp,$sp,-112 # スタック拡張
49 shufb $2,$2,$6,$17
50 lqx $16,$5,$4
51 ila $4,66051
52 ai $sp,$sp,112 # スタック戻し
53 shufb $8,$8,$8,$4
54 sfh $4,$10,$2

レジスタ割り当て

callerレジスタの使用方法で無駄な箇所があります。

:

3 stqd $17,-48($sp) # $17 store
4 lqa $4,Test1In+16
5 lqa $17,.LC0 # $17 def
6 lqa $2,Test1In
:
13 lqa $5,mod_six+20
14 shufb $6,$2,$4,$17 # $17 use
15 shufb $2,$2,$4,$10
:
22 ah $6,$6,$2
23 stqd $16,-32($sp) # $16 store
24 stqd $15,-16($sp)
:
49 shufb $2,$2,$6,$17 #
50 lqx $16,$5,$4 # $16 def
51 ila $4,66051
52 ai $sp,$sp,112
53 shufb $8,$8,$8,$4
54 sfh $4,$10,$2
55 lqd $17,-48($sp) # $17 reload
56 ah $2,$2,$10
:
74 mpya $5,$9,$15,$5
75 mpya $6,$2,$16,$6 # $16 use
76 mpyhha $4,$9,$15
77 lqd $15,-16($sp)
78 mpyhha $7,$2,$16 # $16 use
79 lqd $16,-32($sp) # $16 reload

:

$16と$17が実際は一つのレジスタにまとめることができます。
$17 は 3-55, $16 は 23-79 でライブレンジが重なっているように見えますが、store-reload の部分が重なっているので、これは同一レジスタに割り当てることが可能です。

定数

同じ定数を別のレジスタに割り当てている部分があります。
同じレジスタを使用することによって、レジスタのスピル処理を減らすことができます。

下記の例はMEのものです。定数515に関して一つのレジスタにまとめることが出来ます。

1 SubSampleFi4x4:
2 nop $127
3 stqd $42,-448($sp)
4 il $42,-1152
5 stqd $15,-16($sp)
6 stqd $16,-32($sp)
7 stqd $17,-48($sp)
8 stqd $18,-64($sp)
9 stqd $19,-80($sp)
10 stqd $20,-96($sp)
11 stqd $21,-112($sp)
12 stqd $22,-128($sp)
13 stqd $23,-144($sp)
14 stqd $24,-160($sp)
15 stqd $25,-176($sp)
16 ilh $25,515
17 stqd $26,-192($sp)
18 # iprefetch
19 lnop
20 stqd $27,-208($sp)
21 stqd $28,-224($sp)
22 stqd $29,-240($sp)
23 ilh $29,515
24 stqd $30,-256($sp)
25 stqd $31,-272($sp)
26 stqd $32,-288($sp)
27 nop $127
28 stqd $33,-304($sp)
29 ilh $33,515

無駄なレジスタを使用してしまうのは上記の2と同じ問題です。
inline展開を多用しますと、どうしても使用するレジスタ数が増えます。
無駄なレジスタ使用が 1 つあるとスタックへのロード・セーブでパイプラインの P1 が二つ埋まってしまいます。

無駄なnop

下記のトレース結果はme実行時のものです。
85,86にnopがありますが、これは無駄ではないでしょうか。
ただ、nop を削るとスケジュールが崩れる場合があるので、どういう準拠によってnop が挿入されるかわかりません。
89-92 で処理が空いているのは 87 で $26 の準備をしているのが間に合っていないためです。
80 で $10 の設定処理をしている部分と交換すれば、この無駄なサイクルを削ることが出来ます。

D Cycle 80: PC:00178 [P0:il $9,0x0000] PC:0017c [P1:lqa $10,0x015c0]
S Cycle 81: PC:00180 [P1:stqd $0,0x001($1)]
S Cycle 82: PC:00184 [P1:stqd $1,0x3f2($1)]
S Cycle 83: PC:00188 [P0:ai $1,$1,0x320] PC:0018c [P1:lnop ]
S Cycle 84: PC:00190 [P0:il $8,0x0000] PC:00194 [P1:hbra 0x01c,0x0068]
N Cycle 85: PC:00198 [P1:lnop ]
N Cycle 86: PC:0019c [P0:nopi ,0x00007f]
D Cycle 87: PC:001a0 [P0:ila $20,0x01600] PC:001a4 [P1:lqa $26,0x02000]
S Cycle 88: PC:001a8 [P1:lqa $25,0x02010]
R Cycle 89:
R Cycle 90:
R Cycle 91:
R Cycle 92:
S Cycle 93: PC:001ac [P0:mpyh $24,$9,$26]

[gcc, spu] spu版gcc最適化問題(2)

2004年作成

変数を増やした場合、使用するレジスタが少なくなる

以下のようなサンプルプログラムを作って使用するレジスタ数を比較してみました。
プログラムはperlスクリプトで自動生成されます。

% perl t22.pl <変数の数>
struct s {
int a;
int b;
};
struct s a000;
struct s a001;
struct s a002;
:
int f()
{
int a = 0;
int b = 0;
a += (a000.a ^ a001.b);
a += (a001.a ^ a002.b);
:
a += (a145.a ^ a146.b);
a += (a146.a ^ a000.b);
b += (a000.b ^ a146.a);
b += (a146.b ^ a145.a);
:
b += (a002.b ^ a001.a);
b += (a001.b ^ a000.a);
return a + b;
}

このプログラムを-O3でコンパイルして使用するレジスタ数を調査しました。
結果は次のようになります。

変数の数レジスタの数
80 87
90 97
100 107
110 117
120 127
130 127
140 127
147 127
148 123
149 116
150 113
160 86
170 66
180 55
190 43
200 44

変数の数が 147 までは使用できるレジスタを最大限に使うのですが、それからはレ
ジスタを使用しなくなって最終的に 43,44個に落ち着きます。

命令の順番を代えることによってサイクル数を少なくする

1 sub(char *a, char *b, char *c, int n)
2 {
3 int i;
4 5 for (i = 0; i < n; i++) {
6 c[i] = a[i] + b[i];
7 }
8 }

このソースを-O3でコンパイルすると以下のようになります。

6 sub:
7 ori $11,$3,0
8 cgti $3,$6,0
9 ori $12,$6,0
10 ori $10,$4,0
11 ori $9,$5,0
12 hbra .L9,.L6
13 lnop
14 nop $127
15 il $8,0
16 brz $3,.L8
17 .L6:
18 a $4,$11,$8 ; FX even 2
19 lqx $5,$10,$8 ; LS odd 5
20 a $14,$10,$8 ; FX even 2
21 lqx $2,$11,$8 ; LS odd 5
22 ai $6,$4,13 ; FX even 2
23 lqx $13,$9,$8 ; LS odd 5
24 ai $3,$14,13 ; FX even 2
25 cbx $14,$9,$8 ; SH odd 4
26 rotqby $7,$2,$6 ; SH odd 4
27 rotqby $6,$5,$3 ; SH odd 4
28 ah $5,$7,$6
29 shufb $4,$5,$13,$14
30 stqx $4,$9,$8
31 ai $8,$8,1
32 cgt $3,$12,$8
33 .L9:
34 brnz $3,.L6
35 .L8:

(18,19),(20,21),(22,23),(24,25)がそれぞれ同じサイクルで実行されます。
26 で参照している $2 の準備のために 3 サイクル (22,24 で 2 サイクルですが LS では 5 サイクル必要なため) 無駄になります。
19 行目と 21 行目を交換するか、26,27 行目を交換 (この場合、レジスタ $6 の置き換えが必要) をすれば、無駄なサイクルは 2 サイクルに減ります。

上記のような命令順序の調整でサイクル数を削減することが可能です。

整数演算の順序

1 int foo2(int a, int b, int c, int d)
2 {
3 return a + b + c + d;
4 5 }

上記の関数を-O3でコンパイルすると

1 foo2:
2 a $7,$3,$4
3 a $2,$7,$5
4 a $3,$2,$6
5 bi $lr

のようになります。

これは 2 行目の結果を 3 行目で、3 行目の結果を 4 行目で使用するため、二つの無駄なサイクルが発生します。
ここでは (((a+b)+c)+d) の順序で計算していますが、整数の加算・乗算の場合、順序は関係無いので、((a+b)+(c+d)) のように計算しても問題ありません。
この変更によって無駄なサイクルを減らすことが出来ます。(以下のパターンだと無駄なサイクル数が 1 になります)

1 foo2:
2 a $7,$3,$4
3 a $2,$5,$6
4 a $3,$2,$7
5 bi $lr

乗算の場合は加算よりも効果があります。

1 long foo1(long a, long b, long c, long d)
2 {
3 return a * b * c * d;
4 }

上記の関数をコンパイルした場合：

1 foo1:
2 mpyh $14,$3,$4
3 mpyh $7,$4,$3
4 mpyu $13,$3,$4
5 a $12,$14,$7
6 a $11,$12,$13
7 hbr .L2,$lr
8 mpyh $10,$11,$5
9 nop $127
10 mpyh $2,$5,$11
11 mpyu $9,$11,$5
12 a $5,$10,$2
13 a $4,$5,$9
14 mpyh $3,$6,$4
15 mpyh $8,$4,$6
16 mpyu $7,$4,$6
17 a $5,$8,$3
18 a $3,$5,$7

変更後

1 foo1:
2 mpyh $14,$3,$4
3 mpyh $7,$4,$3
4 mpyu $13,$3,$4
5 mpyh $8,$5,$6
6 mpyh $9,$6,$5
7 mpyu $10,$5,$6
8 hbr .L2,$lr
9 a $12,$14,$7
10 a $15,$8,$9
11 a $11,$12,$13
12 a $16,$15,$10
13 mpyh $13,$11,$16
14 mpyh $14,$16,$11
15 mpyu $15,$11,$16
16 a $5,$13,$14
17 a $3,$5,$15

計算回数は代わりませんが、乗算だけを詰めることで無駄なサイクルがなくなります。
この場合、9サイクル早くなりました。括弧で順序付けすることでも最適化されます。

減算、除算等にも使用できます。

無駄な型変換

1 char foo4(char a, char b, char c, char d)
2 {
3 return a + b + c + d;
4 }

上記のソースを-O3でコンパイルすると以下のアセンブラが生成されます。

1 foo4:
2 nop $127
3 hbr .L6,$lr
4 ah $7,$3,$4
5 ah $4,$5,$7
6 ah $2,$6,$4
7 xsbh $3,$2
8 xshw $2,$3
9 ori $3,$2,0
10 nop $127
11 nop $127
12 nop $127
13 .L6:
14 bi $lr

7行目でhalf wordからbyteへの型変換をやっていますので、8行目の処理が無駄です。

除算

定数がある場合。

1 foo(int a)
2 {
3 return a / 3;
4 }

これを-O3でコンパイルすると以下のようになります。
定数部分の最適化がされていません。

1 .file "c.c"
2 .text
3 .align 3
4 .global foo
5 .type foo, @function
6 foo:
7 il $2,3
8 lnop
9 heqi $2,0 不必要
10 hbrr 3f,1f
11 sfi $10,$3,0
12 sfi $11,$2,0
13 cgti $12,$3,-1
14 cgti $13,$2,-1 事前計算可能
15 selb $10,$10,$3,$12
16 selb $11,$11,$2,$13 事前計算可能
17 xor $13,$12,$13
18 clz $9,$11 事前計算可能
19 clz $6,$10
20 il $7,1
21 sf $9,$6,$9
22 ori $4,$10,0
23 shl $6,$11,$9
24 shl $7,$7,$9
25 il $5,0
26 1: clgt $8,$6,$4
27 sf $9,$6,$4
28 or $14,$5,$7
29 rotmi $7,$7,-1
30 rotmi $6,$6,-1
31 selb $4,$9,$4,$8
32 selb $5,$14,$5,$8
33 3: brnz $7,1b
34 2: sfi $10,$4,0
35 sfi $11,$5,0
36 selb $4,$10,$4,$12
37 selb $5,$5,$11,$13
38 ori $3,$5,0
39 nop $127
40 bi $lr

switch文

switch 文は case 毎にコードブロックを作成しテーブルを使ってそこに間接ジャンプするようになっています。
以下の特殊な場合、処理自体をまとめることで、無駄なジャンプを削ることが可能です。
1. 同じ変数への定数代入が多い。
2. 定数のリターン処理が多い。

1 foo(unsigned short a)
2 {
3 switch(a) {
4 case 0: return 1;
5 case 1: return 2;
6 case 2: return 3;
7 case 3: return 4;
8 case 4: return 5;
9 default: return 0;
10 }
11 }

これをコンパイルすると以下のようになります。

6 foo:
7 ila $2,65535
8 il $5,0
9 and $4,$3,$2
10 clgti $3,$4,4
11 nop $127
12 brnz $3,.L1
13 shli $7,$4,2
14 ila $3,.L9
15 a $6,$7,$3
16 lqd $5,0($6)
17 rotqby $4,$5,$6
18 bi $4
19 .align 2
20 .align 2
21 .L9:
22 .word .L3
23 .word .L4
24 .word .L5
25 .word .L6
26 .word .L7
27 .L3:
28 il $5,1
29 br .L1
30 .L4:
31 il $5,2
32 br .L1
33 .L5:
34 il $5,3
35 br .L1
36 .L6:
37 il $5,4
38 br .L1
39 .L7:
40 il $5,5
41 lnop
42 .L1:
43 ori $3,$5,0
44 bi $lr

この場合、case ブロックへのジャンプと case ブロックから脱出するためのジャンプの 2 回ジャンプが存在しますが、以下のようにすると 1 回ですみます。

6 foo:
7 ila $2,65535
8 il $5,0
9 and $4,$3,$2
10 clgti $3,$4,4
11 nop $127
12 brnz $3,.L1
13 shli $7,$4,2
14 ila $3,.L9
15 a $6,$7,$3
16 lqd $5,0($6)
17 .L1:
18 ori $3,$5,0
19 bi $lr
20 .align 2
21 .align 2
22 .L9:
23 .word 1
24 .word 2
25 .word 3
26 .word 4
27 .word 5

無効命令が削除されない

1 foo(int a)
2 {
3 if (a > 0) {
4 return 0;
5 }
6 if (a > 100) {
7 return 100;
8 }
9 return -1;
10 }

上記の命令では 7 行目の処理が行なわれることはありませんが、アセンブラファイルにはその処理が入っています。

6 foo:
7 cgti $9,$3,100
8 hbr .L4,$lr
9 cgti $5,$3,0
10 nop $127
11 ceqi $8,$9,0
12 ceqi $2,$5,0
13 ori $7,$8,100
14 il $6,0
15 selb $5,$6,$7,$2
16 ori $3,$5,0
17 nop $127
18 .L4:
19 bi $lr

マージ処理

条件分岐でまとめられる処理を SELB で一括処理しようとしていますが、しきれていません。

1 foo(int a, int b, int c)
2 {
3 if (a > b) {
4 a++;
5 b++;
6 c++;
7 } else {
8 a++;
9 b--;
10 c++;
11 }
12 return a + b + c;
13 }

上記のソースでは 4 行目と 8 行目、6 行目と 11 行目がそれぞれ同じものになっています。
これをコンパイルすると、以下のようになります。

6 foo:
7 cgt $13,$3,$4
8 hbr .L4,$lr
9 ori $2,$5,0
10 nop $127
11 ai $7,$3,1
12 ceqi $5,$13,0
13 ai $10,$4,-1
14 ai $14,$4,1
15 selb $12,$7,$7,$5
16 selb $4,$14,$10,$5
17 ai $11,$2,1
18 ceqi $8,$13,0
19 a $5,$12,$4
20 selb $7,$11,$11,$8
21 a $3,$5,$7
22 .L4:
23 bi $lr

selb を使ってまとめて処理しようとしていますが、15 行目と 20 行目の処理でもselb を使用しています。この部分はコピー処理と等価です。

登録: 投稿 (Atom)

2016年10月14日金曜日

2016年9月21日水曜日

2016年9月7日水曜日

2016年7月13日水曜日

2016年7月4日月曜日

2016年6月17日金曜日

2016年5月31日火曜日

2016年5月25日水曜日

flush-lines

2016年5月19日木曜日

2016年4月20日水曜日

2016年3月14日月曜日

ThreadStatic

2016年3月8日火曜日

Checked属性

Binding

2016年3月1日火曜日

2016年2月29日月曜日

2016年2月18日木曜日

2016年2月16日火曜日

2016年2月10日水曜日

2016年2月4日木曜日

整数を文字列に変換する

2016年1月18日月曜日

2016年1月14日木曜日

概要

操作

初期化

登録

コミット

履歴参照

取得

タグ

不要ファイルの削除

変更の取り消し

ファイル名付け直し

外部との連携

マージ

テキストベースでの成果送付

バイナリベースでの成果送付

2016年1月12日火曜日

セレクトした領域の背景色に応じて値を設定

既存のExcelファイルを開く

2016年1月8日金曜日

lambda式

2016年1月7日木曜日

スタックの使用での問題

sort

-

検索

map

準備

構文解析

ファイルを変数に読み込む

2016年1月6日水曜日

問題を回避する手段

変数のstore処理の無駄

ループ最適化

除算

整数演算の順序

インライン展開時のスタックポインタ処理

変数のstore処理の無駄

シャフル用のベクタに関して

ループ最適化

2004年作成

インライン展開時のスタックポインタ処理

レジスタ割り当て

定数

無駄なnop

変数を増やした場合、使用するレジスタが少なくなる

命令の順番を代えることによってサイクル数を少なくする

整数演算の順序

無駄な型変換

除算

switch文

無効命令が削除されない

マージ処理