Character String Extraction from Scene Images Using Direction Distribution of Edge Components and their Spatial Arrangements

北田, 英樹; KITADA, Hideki


Permalink : https://hdl.handle.net/10114/8707

Available files

File	Format	Size	Views	Description
北田　英樹	pdf	4.10 MB	67

Thesis information


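Item type	Thesis or dissertation
Title	エッジ成分の方向分布と空間的配置に着目した情景画像からの文字列抽出
Alternative title	Character String Extraction from Scene Images Using Direction Distribution of Edge Components and their Spatial Arrangements
Author	北田, 英樹
Author	KITADA, Hideki
Language	jpn
Publication date	2013-03-24
Author version flag	Not Applicable (or Unknown)
Date of degree conferral	2013-03-24
Degree	Master of Science (修士(理学))
Degree-granting institution	法政大学 (Hosei University)
Description	Graduate School of Computer and Information Sciences, Major in Computer and Information Sciences; Advisor: 若原徹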
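Abstract	This study proposes a method for extracting character strings from color scene images. The differential image of each RGB channel is binarized, noise components are removed, and the results are merged to obtain edge components that serve as character string candidate regions. Character candidate regions are then selected using conditions on the spatial arrangement of the bounding rectangles of the edge components. Regions left over as single-character candidates, as well as touching-character-string candidates, are passed to a character/non-character decision based on direction distribution features in order to extract the character strings. In particular, to handle the edge components of touching character strings, which could not be extracted by the method that relies on spatial arrangement conditions, tentative segmentation is applied and the character/non-character decision is made for each resulting segment, with the aim of improving extraction accuracy. Applied to the public ICDAR2003 dataset, the proposed method achieved a recall of 71.3%, a precision of 64.5%, and an F-measure of 67.8%.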
Abstract	This paper proposes a method for extracting character strings from scene images using the direction distribution of edge components and their spatial arrangement. First, we detect edge components by applying the Canny operator to each RGB channel and label the connected edge components. Second, for each labeled edge component, we count its edge points and compute the minimum and maximum of its x- and y-coordinates. Third, we remove edge components that have too few edge points, are too wide or too tall, or show too large a difference between width and height. Fourth, we generate character string candidates by merging the edge components obtained from the individual RGB channels. Fifth, we extract character strings based on the spatial arrangement of the edge components. Sixth, we extract isolated characters using direction distribution features of the edge components. Finally, we extract character strings composed of concatenated (touching) characters and isolated characters through tentative segmentation and character recognition, evaluating the character likeness of individual components. Experiments on a total of 249 images from the ICDAR2003 Robust Reading and Text Locating dataset "SceneTrialTest" show that the proposed method achieves a recall of 71.3%, a precision of 64.5%, and an F-measure of 67.8%.
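Resource type	Thesis
Index	By material type > Theses and dissertations > Master's theses > Graduate School of Computer and Information Sciences
Index	109 Faculty of Computer and Information Sciences / Graduate School of Computer and Information Sciences > Theses and dissertations > Master's theses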
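
The component-filtering steps described in the English abstract (counting the edge points of each labeled component, taking its bounding box, and discarding components that are too small, too large, or too elongated) can be sketched as follows. This is a minimal illustration in plain Python, not the thesis implementation: the function names `label_edge_components` and `keep_component`, the use of 8-connectivity, the thresholds `min_points`, `max_size`, and `max_aspect`, and the reading of "difference between width and height" as an aspect ratio are all assumptions made for the example. The input is an already binarized edge map, so the Canny step itself is not shown.

```python
from collections import deque


def label_edge_components(edge_map):
    """Group 8-connected edge pixels and return one summary dict per component.

    edge_map is a list of rows of 0/1 values (1 = edge pixel).
    Each summary holds the edge point count and the bounding box.
    """
    h, w = len(edge_map), len(edge_map[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y0 in range(h):
        for x0 in range(w):
            if not edge_map[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first search over the component containing (y0, x0).
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            points, xmin, xmax, ymin, ymax = 0, x0, x0, y0, y0
            while queue:
                y, x = queue.popleft()
                points += 1
                xmin, xmax = min(xmin, x), max(xmax, x)
                ymin, ymax = min(ymin, y), max(ymax, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and edge_map[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append({"points": points, "xmin": xmin, "xmax": xmax,
                               "ymin": ymin, "ymax": ymax})
    return components


def keep_component(c, min_points=4, max_size=50, max_aspect=5.0):
    """Return True if the component passes the three (hypothetical) filters."""
    width = c["xmax"] - c["xmin"] + 1
    height = c["ymax"] - c["ymin"] + 1
    if c["points"] < min_points:                 # too few edge points
        return False
    if width > max_size or height > max_size:    # too wide or too tall
        return False
    # Assumed interpretation: reject strongly elongated components.
    return max(width, height) / min(width, height) <= max_aspect


# Toy edge map: a small ring, an isolated pixel, and a long horizontal line.
edge_map = [
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
candidates = [c for c in label_edge_components(edge_map) if keep_component(c)]
```

With the default thresholds, only the ring survives in this toy example: the isolated pixel has too few edge points and the horizontal line is too elongated. In the pipeline outlined in the abstract, a filter of this kind would run once per RGB channel before the surviving components are merged into character string candidates.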