Technical Memorandum

Thursday, January 27, 2011

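Two DOM hash properties

The "#…" part of a URL is formally called the "fragment identifier", but saying "fragment identifier" out loud rarely gets through to anyone. Since "#" is commonly called a "hash mark" or "sharp", in everyday conversation phrases like "the part after the hash" or "sharp something" work fine.

This note summarizes how to read the fragment identifier from JavaScript.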

window.location.hash

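To get the fragment identifier of the URL currently shown in the browser, it is well known that you can write window.location.hash in JavaScript; the value includes the leading "#". (Incidentally, Mozilla's documentation, window.location - MDC Doc Center, says "DOM Level 0. Not part of any standard.", so it has not been standardized by the W3C.)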

HTMLAnchorElement.hash

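Besides location, the fragment identifier also comes up when processing the href attribute of an a element. Until quite recently I was extracting the "#…" part by string-processing the href value, but that is unnecessary. Anchor objects also have a hash property, so it is enough to fetch the element with getElementById and read its hash property. The following example takes the anchor's href value "http://dminor11th.blogspot.com/index.html#foo" and shows the "#foo" part in an alert.

Example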

<html>
  <head>
    <title>test</title>
  </head>
  <body>
    <p><a id="a1" href="http://dminor11th.blogspot.com/index.html#foo">anchor1</a></p>
    <script type="text/javascript">
      alert(document.getElementById('a1').hash);
    </script>
  </body>
</html>

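Incidentally, this hash property, that is, the hash property of HTMLAnchorElement, does not appear to be standardized by the W3C either (it is not listed in the DOM Level 2 HTML Spec. and similar documents). According to Mozilla's HTMLAnchorElement - MDC Doc Center, it is expected to be included in HTML5.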
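When the URL is only available as a string rather than as an a element, current browsers and Node.js offer the URL constructor, which exposes the same hash property. The following is a minimal sketch under that assumption; the helper name fragmentOf is made up for illustration and does not come from any library.

```javascript
// Minimal sketch: read the fragment identifier from a URL string.
// Assumes an environment where the global URL constructor exists
// (current browsers, Node.js). fragmentOf is a hypothetical helper name.
function fragmentOf(urlString) {
  // .hash includes the leading "#", or is "" when there is no fragment
  return new URL(urlString).hash;
}

console.log(fragmentOf('http://dminor11th.blogspot.com/index.html#foo')); // "#foo"
console.log(fragmentOf('http://dminor11th.blogspot.com/index.html'));     // "" (no fragment)
```

As with the anchor element, no manual string processing of the href value is needed.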

References

ハッシュマークとは - IT用語辞典 Weblio辞書 (Weblio IT glossary entry for "hash mark")

window.location - MDC Doc Center

Document Object Model HTML - Interface HTMLAnchorElement

HTMLAnchorElement - MDC Doc Center

4.6 Text-level semantics — HTML5

Text-level semantics - HTML5 API Checker - HTML5 Tutorial - HTML5.JP

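Sunday, January 16, 2011

setuid scripts

Sometimes you want a script written in shell or Perl to behave like a setuid program. This post is about what to do in that case (because of the security problems it is almost useless at work; treat it as general-knowledge material).

There are probably many approaches; two of them are recorded here.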

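Method 1. A wrapper program using `setuid(0)` and `system()`

As explained in "setuid on shell scripts", for script files such as shell or Perl scripts the setuid bit is ignored even if it is set. So you write an ELF executable in C as a wrapper and set the setuid bit on that, but simply running the script with system(…) inside it is not enough. You have to call setuid(0) before system(…).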

…
int main()
{
   setuid( 0 );
   system( "/path/to/script.sh" );

   return 0;
}

The reason setuid(0) is needed becomes clear from "man 3 system" and "man sh".

man 3 system:: DESCRIPTION system() executes a command specified in command by calling /bin/sh -c command, and returns after the command has been completed. During execution of the command, SIGCHLD will be blocked, and SIGINT and SIGQUIT will be ignored.
man sh:: If the shell is started with the effective user (group) id not equal to the real user (group) id, and the -p option is not supplied, no startup files are read, shell functions are not inherited from the environment, the SHELLOPTS variable, if it appears in the environment, is ignored, and the effective user id is set to the real user id. If the -p option is supplied at invocation, the startup behavior is the same, but the effective user id is not reset.

To summarize: system("/path/to/script.sh") is turned into /bin/sh -c "/path/to/script.sh". This /bin/sh starts with effective user id = root and real user id = the ordinary user, so the effective user id is overwritten with the ordinary user's id, and in the end the setuid program's credentials are lost.

To prevent this, system(…) must be called with the effective user id and the real user id equal, which is why setuid(0) is called first (when the effective user id is root, setuid(0) sets both the effective user id and the real user id to 0, i.e. root).
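The rule quoted from man sh can be written down as a tiny model to make the three cases easier to compare. This is only a sketch of the reasoning, not something that touches real process credentials; the function name shellEffectiveUid and the uid 505 are made up for illustration.

```javascript
// Hypothetical model (not a real API) of the startup rule quoted from "man sh":
// if effective uid != real uid and -p is not given, the shell resets the
// effective uid to the real uid.
function shellEffectiveUid(realUid, effectiveUid, privileged) {
  if (effectiveUid !== realUid && !privileged) {
    return realUid;      // credentials of the setuid program are dropped
  }
  return effectiveUid;   // ids already match, or -p was supplied
}

const ROOT = 0;
const USER = 505;        // example id of an ordinary user (assumption)

// setuid wrapper calls system() directly: real=505, effective=0
console.log(shellEffectiveUid(USER, ROOT, false)); // 505
// wrapper calls setuid(0) first: real=0, effective=0
console.log(shellEffectiveUid(ROOT, ROOT, false)); // 0
// shell started with -p: the effective uid is not reset
console.log(shellEffectiveUid(USER, ROOT, true));  // 0
```

The second case is what Method 1 relies on; the third case corresponds to the -p option mentioned in the man sh quote.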

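Note: Japanese translations of UNIX process credential terms vary and are ambiguous, so the English terms are better left as they are. For reference, the book I have at hand, "Amazon.co.jp：詳解UNIXプログラミング: W.リチャードスティーヴンス, W.Richard Stevens, 大木敦雄: 本", translates "real user id" as 「実ユーザID」 and "effective user id" as 「実効ユーザID」.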

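Method 2. A wrapper program using `execve()`

Here is an alternative: stop using system() and setuid(0), and use execve() from the exec() family instead.

Why execve()? Because it is recommended by man 3 system and by "IPA ISEC　セキュア・プログラミング講座：C/C++言語編　第10章著名な脆弱性対策：コマンド注入攻撃対策" (IPA ISEC Secure Programming Course, C/C++ edition, chapter 10, countermeasures against command injection attacks). In short, it lets you control the environment variables passed on to the script.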

man 3 system: Do not use system() from a program with set-user-ID or set-group-ID privileges, because strange values for some environment variables might be used to subvert system integrity. Use the exec(3) family of functions instead, but not execlp(3) or execvp(3).
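Quoted from the IPA ISEC Secure Programming Course:: Of these, the three recommended for use are execle, execve and execvP, because they are unaffected even if the PATH environment variable has been tampered with, and they let you control the environment variables given to the launched program.

At the same time, the script being executed also needs a tweak. Specifically, as described in man sh, add the -p option (Turn on privileged mode) to the shebang line to keep the child process's effective user id from being changed.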

man sh:: -p Turn on privileged mode. In this mode, the $ENV and $BASH_ENV files are not processed, shell functions are not inherited from the environment, and the SHELLOPTS variable, if it appears in the environment, is ignored. If the shell is started with the effective user (group) id not equal to the real user (group) id, and the -p option is not supplied, these actions are taken and the effective user id is set to the real user id. If the -p option is supplied at startup, the effective user id is not reset. Turning this option off causes the effective user and group ids to be set to the real user and group ids.

Putting this together gives the following code.

Code using `execve()` (C)

The flow is fork() followed by execve(). After compiling, make the binary owned by root and chmod it to 4755.

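/* setuid_test.c */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>   /* wait() */
#include <unistd.h>
#include <signal.h>
int main(int argc, char *argv[]){
  pid_t pid;
  if((pid=fork()) < 0){
    return 1;            /* fork failed */
  }else if(pid > 0){
    wait(NULL);          /* parent: wait for the child to finish */
  }else{
    /* child: run the script with no arguments and an empty environment.
       Linux accepts NULL for argv and envp, but this is not portable. */
    if(execve("/path/to/script.sh", NULL, NULL) == -1){
      return 1;
    }
    return 0;
  }
  return 0;
}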

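Script code (the -p on the shebang line is required)

The script is owned by an ordinary user, with its mode set to 744 by chmod.

#!/bin/bash -p
tail /var/log/maillog;  # try displaying a log that only root can read

Example run

It looks like this (checked on CentOS5).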

$ ./setuid_test
Jan 16 10:11:16 example postfix/smtp[9884]: D446F10C4F9B: host mx.example.com[999.999.999.999] refused to talk to me: 421 Message from (999.999.999.999) temporarily deferred - …
Jan 16 10:11:17 example postfix/smtp[9884]: D446F10C4F9B: host mx.example.com[999.999.999.999] refused to talk to me: 421 Message from (999.999.999.999) temporarily deferred - …
…

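Incidentally, if the -p is removed from the shebang line, setuid(x) and setgid(x) are executed automatically after execve("/path/to/script.sh", NULL, NULL), and the setuid program's credentials are lost (here x is the ordinary user's ID). Below is that behaviour recorded with strace.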

$ id
uid=505(testuser) gid=505(testuser) groups=505(testuser)
$ strace -f -o strace.txt ./setuid_test
$ cat strace.txt
…(omitted)…
25971 execve("/path/to/script.sh", [0], [/* 0 vars */]) = 0
…(omitted)…
25971 getuid()                          = 505
25971 getgid()                          = 505
25971 geteuid()                         = 0
25971 getegid()                         = 505
25971 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
25971 setuid(505)                       = 0
25971 setgid(505)                       = 0
25971 open("/proc/meminfo", O_RDONLY)   = 3
…(omitted)…


Trying Macro-Defining Macros in Emacs

The example in Onlisp: Macro-Defining Macros also worked in Emacs (GNU Emacs 23.1.1 (i386-mingw-nt6.1.7600)). It is a macro called "abbrev" for abbreviating macros with long names, a so-called macro-defining macro.

abbrev

(defmacro abbrev (short long)
  `(defmacro ,short (&rest args)
     `(,',long ,@args)))

Abbreviating multiple-value-bind as mvbind

(abbrev mvbind multiple-value-bind)
mvbind

; a forced example of using mvbind (prints the quotient and remainder of the given numbers)
((lambda (x y)
   (labels ((divide (x y) (list (/ x y) (% x y)))) ; function returning a list of quotient and remainder
     (mvbind (q r)
            (divide x y)
            (format "quotient: %d, remainder: %d" q r) )))
 ; try 2011 divided by 3
 2011 3)
"quotient: 670, remainder: 1"

Abbreviating destructuring-bind as dbind

(abbrev dbind destructuring-bind)
dbind

; a forced example of using dbind (builds an HTML a tag from the given list)
((lambda (lst)
   (dbind
    (title href)
    lst
    (format "<a href=\"%s\" title=\"%s\">%s</a>" href title title))) ; convert the list elements into an a tag
 '("title1" "http://dminor11th.blogspot.com/"))
"<a href=\"http://dminor11th.blogspot.com/\" title=\"title1\">title1</a>"


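Saturday, January 15, 2011

document.anchors and document.links

A summary of the difference between the DOM's "document.anchors" and "document.links", which I had never paid attention to until now.

An a element with a name attribute is treated as an anchor, so it belongs to document.anchors.
An a element with an href attribute is treated as a link, so it belongs to document.links.
An a element with both a name attribute and an href attribute belongs to both document.links and document.anchors.
The presence or absence of an id attribute is irrelevant to the anchor/link distinction (IE behaves differently: an element with an id attribute is treated as an anchor).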
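The rules above can also be sketched without a browser by modelling each a element as a plain object. This is only an illustration of the non-IE rules; the objects and the helper names has and collect are made up here and are not DOM APIs.

```javascript
// Sketch of the membership rules using plain objects in place of <a> elements.
// The element list and helper names are hypothetical, for illustration only.
const elements = [
  { text: 'a1', attrs: { name: 'a1' } },              // anchor
  { text: 'a2', attrs: { href: 'foo' } },             // link
  { text: 'a3', attrs: { name: 'a3', href: 'bar' } }, // anchor and link
  { text: 'a4', attrs: { id: 'a4' } }                 // neither (id is irrelevant)
];

function has(attrs, key) {
  return Object.prototype.hasOwnProperty.call(attrs, key);
}

// Collect the text of every element that carries the given attribute.
function collect(list, attrName) {
  return list.filter(function (e) { return has(e.attrs, attrName); })
             .map(function (e) { return e.text; });
}

console.log('anchors: ' + collect(elements, 'name').join(', ')); // anchors: a1, a3
console.log('links: ' + collect(elements, 'href').join(', '));   // links: a2, a4 is excluded -> a2, a3
```

The element with only an id attribute lands in neither list, which matches the last rule for non-IE browsers.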

Example for verification

The text nodes of the a elements are displayed in alerts. a1 and a3 are shown as document.anchors, and a2 and a3 as document.links.

<html>
<head>
  <title>test</title>
</head>
<body>

<ul>
  <li><a name="a1">a1</a></li> <!-- anchor -->
  <li><a href="foo">a2</a></li> <!-- link -->
  <li><a name="a3" href="bar">a3</a></li> <!-- anchor, link -->
</ul>

<script type="text/javascript">
  function collectText(a){
    var stack = new Array();
    for(var i=0, n=a.length; i<n; i++){
      stack.push(a[i].childNodes[0].nodeValue);
    }
    return stack.join(', ');
  }
  alert('document.anchors[' + collectText(document.anchors) + ']');
  alert('document.links[' + collectText(document.links) + ']');
</script>

</body>
</html>

Note: this worked as expected in IE-6, IE-7, IE-8, Firefox-3.6.13, Chrome-8.0.552.224 and Safari-5.0.2 (Windows7).


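Monday, January 10, 2011

Distinguishing progressive JPEG from baseline JPEG with Perl

JPEG files are classified into progressive JPEG and baseline JPEG (see: Baseline JPEG and Progressive JPEG?).

For images used on websites, progressive is probably the common choice these days, but mobile sites are a different story, because some mobile phone models do not support progressive JPEG.

So unless you are conscious of this difference, the test phase of a mobile site produces reports like "photos are not displayed on a certain model". In that situation, that is, when you want to track down the progressive JPEGs on a server, Perl's Image::Info is useful (of course there are options other than Perl; see: プログレッシブJPEGやCMYKモードのJPEGをUnix環境で判別（判定） | はやとも　-hayatomo.com-).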

Installing Image::Info

On CentOS5 it could be installed from the CPAN shell.

# perl -MCPAN -e shell
install Image::Info

Baseline/Progressive detection script

A program that takes the path of a JPEG file as its argument and simply prints "file path: Baseline" or "file path: Progressive".

#!/usr/bin/perl
use strict;
use warnings;
use Image::Info "image_info";

my $info = image_info($ARGV[0]);
print "$ARGV[0]: $info->{'JPEG_Type'}\n";

Example run

Use find to pick out the jpg files and pass them to the script (the script is located at "./info_jpeg.pl").

$ find /foo/bar/images/ -name '*jpg' -exec ./info_jpeg.pl {} \;
/foo/bar/images/pic01.jpg: Baseline
/foo/bar/images/pic02.jpg: Progressive
/foo/bar/images/pic03.jpg: Baseline


Supplement

The contents of the hash returned by Image::Info's image_info function. Here "JPEG_Type" indicates Baseline/Progressive.

 $VAR1 = {
          'width' => 240,
          'file_media_type' => 'image/jpeg',
          'file_ext' => 'jpg',
          'color_type' => 'YCbCr',
          'AdobeTransformVersion' => 100,
          'ColorComponents' => [
                                 [
                                   'Y',
                                   34,
                                   0
                                 ],
                                 [
                                   'Cb',
                                   17,
                                   1
                                 ],
                                 [
                                   'Cr',
                                   17,
                                   1
                                 ]
                               ],
          'ColorComponentsDecoded' => [
                                        {
                                          'ComponentIdentifier' => 'Y',
                                          'HorizontalSamplingFactor' => 2,
                                          'VerticalSamplingFactor' => 2,
                                          'QuantizationTableDesignator' => 0
                                        },
                                        {
                                          'ComponentIdentifier' => 'Cb',
                                          'HorizontalSamplingFactor' => 1,
                                          'VerticalSamplingFactor' => 1,
                                          'QuantizationTableDesignator' => 1
                                        },
                                        {
                                          'ComponentIdentifier' => 'Cr',
                                          'HorizontalSamplingFactor' => 1,
                                          'VerticalSamplingFactor' => 1,
                                          'QuantizationTableDesignator' => 1
                                        }
                                      ],
          'BitsPerSample' => [
                               8,
                               8,
                               8
                             ],
          'SamplesPerPixel' => 3,
          'JFIF_Version' => '1.02',
          'App12-Ducky' => '2',
          'AdobeTransformFlags' => [
                                     49152,
                                     0
                                   ],
          'AdobeTransform' => 1,
          'height' => 180,
          'resolution' => '100/100',
          'JPEG_Type' => 'Baseline'
        };
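For comparison, the baseline/progressive distinction is recorded in the file itself as the start-of-frame marker: SOF0 (0xFFC0) for baseline DCT and SOF2 (0xFFC2) for progressive DCT. The following is a rough sketch of reading that marker directly, written in JavaScript rather than Perl; the function name jpegType is made up for illustration, only these two frame types are recognized, and it is not how Image::Info is implemented as far as this post knows.

```javascript
// Rough sketch: walk the JPEG marker segments and report the frame type
// from the first SOF0/SOF2 marker. jpegType is a hypothetical helper name.
// Accepts a Uint8Array (a Node.js Buffer also works).
function jpegType(bytes) {
  // A JPEG stream starts with the SOI marker 0xFFD8.
  if (bytes.length < 4 || bytes[0] !== 0xff || bytes[1] !== 0xd8) {
    return null;
  }
  let pos = 2;
  while (pos + 3 < bytes.length) {
    if (bytes[pos] !== 0xff) {
      return null;                       // not positioned on a marker; give up
    }
    const marker = bytes[pos + 1];
    if (marker === 0xff) {               // fill byte before a marker
      pos += 1;
      continue;
    }
    if (marker === 0xc0) return 'Baseline';
    if (marker === 0xc2) return 'Progressive';
    if (marker === 0xda || marker === 0xd9) {
      return null;                       // scan data or end of image reached first
    }
    if (marker === 0x01 || (marker >= 0xd0 && marker <= 0xd7)) {
      pos += 2;                          // standalone marker without a length field
      continue;
    }
    // Other segments carry a 2-byte big-endian length that includes itself.
    const length = (bytes[pos + 2] << 8) | bytes[pos + 3];
    pos += 2 + length;
  }
  return null;
}

// Synthetic data: SOI, a 4-byte APP0 segment, then a start-of-frame marker.
const progressive = new Uint8Array([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x04, 0x00, 0x00, 0xff, 0xc2, 0x00, 0x02]);
console.log(jpegType(progressive)); // Progressive
```

With Node.js, the bytes of a real file could be passed in the same way, for example the Buffer returned by fs.readFileSync.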

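Friday, December 24, 2010

C-u M-x shell and how to save keystrokes

I often use the shell command inside Emacs. The OS I use at work is Windows, but rather than reaching casually for the mouse, I close my eyes, combine GNU commands and Perl one-liners in my head, type them, and run them.

When using shell, just running M-x shell gives you only a single *shell* buffer, but I do a lot of work in parallel and want multiple buffers (one per customer, one per project).

To open multiple *shell* buffers, run the shell command with a so-called prefix argument. Concretely, type the following keys.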

C-u M-x shell[Enter][Enter]

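As a result, numbered buffers are created one after another: *shell*, *shell*<2>, *shell*<3>, ...

That satisfies what I need for work, but typing C-u M-x shell[Enter][Enter] every time gets tedious. So let's think about how to shorten this operation.

At first I thought of defining a function that wraps (shell) and binding a key to it, but I could not quite understand the specifications of the functions and variables related to prefix arguments ((universal-argument), current-prefix-arg, and so on) and gave up.

So I decided to implement it with a keyboard macro, as follows.

1. Record C-u M-x shell as a macro

Type the following keys.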

C-x ( C-u M-x shell[Enter][Enter] C-x )

2. Give the keyboard macro a name

For now I chose the single-letter name "s" and evaluated the following expression.

(name-last-kbd-macro 's)

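At this point, running M-x s[Enter] should let you start multiple shells.

3. Make the keyboard macro persistent

(insert-kbd-macro 's)

Evaluating the expression above in the *scratch* buffer should produce output like the following.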

(fset 's
   [?\C-u ?\M-x ?s ?h ?e ?l ?l return return])

This is the definition of the keyboard macro. To keep using it, paste it into .emacs, init.el, or the like.


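Thursday, December 16, 2010

When using MochiKit's selector (MochiKit.Selector) in IE5.5

In most jobs IE6 and later are the supported targets, but occasionally IE5.5 has to be supported as well.

The trouble with IE5.5 is JavaScript libraries. I usually write JavaScript almost exclusively with jQuery, but unfortunately it raises errors in IE5.5.

For small scripts I make do with built-in JavaScript functions alone, but when that is inefficient I bring in a library to replace jQuery. There are plenty of so-called "lightweight" JavaScript libraries, so which one to use? In my case it is MochiKit (it raises no errors in IE5.5, and there is also the inertia of having used it long ago).

There is a mistake I make frequently in that situation, that is, with the combination of MochiKit and IE5.5, so I am writing it down as a reminder.

Write class selectors together with the element name

Suppose the HTML document contains a div element with the class foo:

NG (does not work)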

  MochiKit.DOM.addLoadEvent(function(){
    alert($$('.foo').length);
  });

With this code the result is 0. For some reason .foo alone fails to pick up the target element.

OK

  MochiKit.DOM.addLoadEvent(function(){
    alert($$('div.foo').length);
  });

When the element name div is also specified in the selector, the expected result 1 is obtained.
