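While reading the manual for Enumerable#sort_by (the sort_by that Array uses), I came across a mention of Kernel#test, which looked handy.
Excerpts from the sort_by manual
First, a benchmark of sort vs. sort_by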
    require 'benchmark'

    a = (1..100000).map { rand(100000) }

    Benchmark.bm(10) do |b|
      b.report("Sort")    { a.sort }
      b.report("Sort by") { a.sort_by { |a| a } }
    end

produces:

                    user     system      total        real
    Sort        0.180000   0.000000   0.180000 (  0.175469)
    Sort by     1.980000   0.040000   2.020000 (  2.013586)
So sort_by is slow here.
But...
Sorting files by mtime
However, consider the case where comparing the keys is a non-trivial operation. The following code sorts some files on modification time using the basic sort method.

    files = Dir["*"]
    sorted = files.sort { |a, b| File.new(a).mtime <=> File.new(b).mtime }
    sorted   #=> ["mon", "tues", "wed", "thurs"]
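When it comes to comparing by mtime, well, I would reach for File objects too.
Enter Kernel#test
However, this creates two File objects on every comparison, which is inefficient.
This is where Kernel#test comes in.
Huh, I didn't know about it. Really handy.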
This sort is inefficient: it generates two new File objects during every comparison. A slightly better technique is to use the Kernel#test method to generate the modification times directly.

    files = Dir["*"]
    sorted = files.sort { |a, b| test(?M, a) <=> test(?M, b) }
    sorted   #=> ["mon", "tues", "wed", "thurs"]
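To make the difference concrete, here is a minimal sketch that reads one file's modification time in three ways: through a File object, through the File.mtime class method, and through Kernel#test with the "M" command. The helper name mtime_three_ways and the temporary file are assumptions made up for this illustration; they do not come from the manual.

```ruby
require "tmpdir"

# Hypothetical helper: returns the mtime of +path+ obtained in three ways.
def mtime_three_ways(path)
  {
    file_object:  File.open(path) { |f| f.mtime }, # builds a File object first
    class_method: File.mtime(path),                # class method, no File object
    kernel_test:  test(?M, path)                   # Kernel#test with the "M" command
  }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "sample.txt")
  File.write(path, "hello")

  times = mtime_three_ways(path)
  # All three are expected to report the same Time for the same file.
  puts times.values.uniq.size == 1
end
```

Only the first variant allocates a File object, which is the overhead the quoted passage is pointing at.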
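There is still waste
Sure enough, even if we stop creating File objects every time, Time objects are still created on every comparison.
This is where the Schwartzian Transform from yesterday's post comes in.
Reference: Learning the Schwartzian transform and the Fisher-Yates shuffle from Ruby's sort_by / shuffle - rochefort's blog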
This still generates many unnecessary Time objects. A more efficient technique is to cache the sort keys (modification times in this case) before the sort. Perl users often call this approach a Schwartzian Transform, after Randal Schwartz. We construct a temporary array, where each element is an array containing our sort key along with the filename. We sort this array, and then extract the filename from the result.

    sorted = Dir["*"].collect { |f| [test(?M, f), f] }.sort.collect { |f| f[1] }
    sorted   #=> ["mon", "tues", "wed", "thurs"]
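The three steps described above (decorate, sort, undecorate) can be tried out on files with known timestamps. The following is a sketch under assumptions: the helper name sort_by_mtime_manually, the file names, and the fixed timestamps set with File.utime are all invented for this example.

```ruby
require "tmpdir"

# Hypothetical helper: decorate-sort-undecorate by modification time.
def sort_by_mtime_manually(paths)
  paths
    .map { |f| [test(?M, f), f] }  # decorate: read each mtime exactly once
    .sort                          # arrays compare element by element, so mtime comes first
    .map { |_mtime, f| f }         # undecorate: keep only the filename
end

Dir.mktmpdir do |dir|
  base = Time.at(1_000_000_000)

  # Give the files increasing mtimes in the order mon < tues < wed.
  { "mon" => 0, "tues" => 60, "wed" => 120 }.each do |name, offset|
    path = File.join(dir, name)
    File.write(path, "")
    File.utime(base + offset, base + offset, path)
  end

  unsorted = %w[wed mon tues].map { |name| File.join(dir, name) }
  p sort_by_mtime_manually(unsorted).map { |f| File.basename(f) }
  #=> ["mon", "tues", "wed"]
end
```

One side effect of sorting the [mtime, filename] pairs directly: when two files have the same mtime, the comparison falls through to the filename, so ties are ordered by name.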
And sort_by is what does this internally
This is exactly what sort_by does internally.

    sorted = Dir["*"].sort_by { |f| test(?M, f) }
    sorted   #=> ["mon", "tues", "wed", "thurs"]
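In other words, sort_by is a Schwartzian transform. It loses to sort when the keys are cheap to compare, as in the benchmark above, but there are plenty of situations where sort_by comes out ahead, namely when computing each key is expensive.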
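One way to see why sort_by can win is to count how many times the key is computed rather than to measure time. The sketch below is only an illustration: the helper name count_key_calls and the stand-in key (the reversed decimal string of each number) are assumptions, chosen so that every key is distinct.

```ruby
# Hypothetical helper: counts key computations for sort with a block vs. sort_by.
def count_key_calls(items)
  calls = { sort: 0, sort_by: 0 }
  # Stand-in for an expensive key; it also records who called it.
  key = lambda do |label, x|
    calls[label] += 1
    x.to_s.reverse
  end

  by_sort    = items.sort    { |a, b| key.(:sort, a) <=> key.(:sort, b) }
  by_sort_by = items.sort_by { |x| key.(:sort_by, x) }

  [calls, by_sort == by_sort_by]
end

calls, same_result = count_key_calls((1..1000).to_a.shuffle)
puts calls[:sort_by]  # one key computation per element
puts calls[:sort]     # two key computations per comparison
puts same_result
```

With sort, the key is recomputed for both sides of every comparison, so the count grows with the number of comparisons; with sort_by it is computed once per element, at the cost of building the temporary array of pairs.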
Kernel#test options
There are lots of handy ones.
module function Kernel.#test (Ruby 2.3.0)
    Test   Returns   Meaning
    "A"  | Time    | Last access time for file1
    "b"  | boolean | True if file1 is a block device
    "c"  | boolean | True if file1 is a character device
    "C"  | Time    | Last change time for file1
    "d"  | boolean | True if file1 exists and is a directory
    "e"  | boolean | True if file1 exists
    "f"  | boolean | True if file1 exists and is a regular file
    "g"  | boolean | True if file1 has the setgid bit set (false under NT)
    "G"  | boolean | True if file1 exists and has a group ownership equal to the caller's group
    "k"  | boolean | True if file1 exists and has the sticky bit set
    "l"  | boolean | True if file1 exists and is a symbolic link
    "M"  | Time    | Last modification time for file1
    "o"  | boolean | True if file1 exists and is owned by the caller's effective uid
    "O"  | boolean | True if file1 exists and is owned by the caller's real uid
    "p"  | boolean | True if file1 exists and is a fifo
    "r"  | boolean | True if file1 is readable by the effective uid/gid of the caller
    "R"  | boolean | True if file1 is readable by the real uid/gid of the caller
    "s"  | int/nil | If file1 has nonzero size, return the size, otherwise return nil
    "S"  | boolean | True if file1 exists and is a socket
    "u"  | boolean | True if file1 has the setuid bit set
    "w"  | boolean | True if file1 exists and is writable by the effective uid/gid
    "W"  | boolean | True if file1 exists and is writable by the real uid/gid
    "x"  | boolean | True if file1 exists and is executable by the effective uid/gid
    "X"  | boolean | True if file1 exists and is executable by the real uid/gid
    "z"  | boolean | True if file1 exists and has a zero length

    Tests that take two files:

    "-"  | boolean | True if file1 and file2 are identical
    "="  | boolean | True if the modification times of file1 and file2 are equal
    "<"  | boolean | True if the modification time of file1 is prior to that of file2
    ">"  | boolean | True if the modification time of file1 is after that of file2
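A few of these commands can be tried against throwaway files. This is a rough sketch, not an exhaustive tour: the helper name probe_with_test, the file names, and the timestamps are assumptions made up for the example, and the commands are passed as one-character strings as in the table.

```ruby
require "tmpdir"

# Hypothetical helper: creates two files in +dir+ and collects some Kernel#test results.
def probe_with_test(dir)
  old_file = File.join(dir, "old.txt")
  new_file = File.join(dir, "new.txt")
  File.write(old_file, "12345") # 5 bytes
  File.write(new_file, "")      # empty
  File.utime(Time.at(1_000_000_000), Time.at(1_000_000_000), old_file)
  File.utime(Time.at(1_000_000_060), Time.at(1_000_000_060), new_file)

  {
    dir_is_directory: test("d", dir),
    missing_exists:   test("e", File.join(dir, "no_such_file")),
    old_is_regular:   test("f", old_file),
    old_size:         test("s", old_file),           # size when nonzero
    new_size:         test("s", new_file),           # nil when the file is empty
    new_is_empty:     test("z", new_file),
    old_is_older:     test("<", old_file, new_file), # two-file test on mtimes
    same_mtime:       test("=", old_file, old_file)
  }
end

Dir.mktmpdir { |dir| p probe_with_test(dir) }
```

Most of these have File class-method equivalents such as File.directory? or File.size?, so the single-character commands are mainly a compact shorthand.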