CSSCAN(Character Set Scanner utility)是Oracle提供的一个用于检查字符集转换过程中可能会出现的数据丢失或者损坏的情况。也可以单独的扫描某些表某些列能否进行字符集转换,并且能够并行扫描以加快扫描速度。
1.创建相关数据字典
SYS@NING>@$Oracle_home\rdbms\admin\csminst.sql
SYS@NING>spool off
该脚本会创建一个叫csmig的用户。
2.查看csscan命令行帮助
Character Set Scanner v2.1 : Release 10.2.0.3.0 - Production on Thu Jul 5 09:43:45 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
You can let Scanner prompt you for parameters by entering the CSSCAN
command followed by your username/password:
Example: CSSCAN SYSTEM/MANAGER
Or, you can control how Scanner runs by entering the CSSCAN command
followed by various parameters. To specify parameters, you use keywords:
Example: CSSCAN SYSTEM/MANAGER FULL=y TOCHAR=utf8 ARRAY=1024000 PROCESS=3
Keyword Default Prompt Description
---------- ------- ------ -------------------------------------------------
USERID yes username/password
FULL N yes scan entire database
USER yes owner of tables to be scanned
TABLE yes list of tables to scan
COLUMN yes list of columns to scan
EXCLUDE list of tables to exclude from scan
TOCHAR yes new database character set name
FROMCHAR current database character set name
TONCHAR new national character set name
FROMNCHAR current national character set name
ARRAY 1024000 yes size of array fetch buffer
PROCESS 1 yes number of concurrent scan process
MAXBLOCKS split table if block size exceed MAXBLOCKS
CAPTURE N capture convertible data
SUPPRESS maximum number of exceptions logged for each table
FEEDBACK report progress every N rows
BOUNDARIES list of column size boundaries for summary report
LASTRPT N generate report of the last database scan
LOG scan base file name of report files
PARFILE parameter file name
PRESERVE N preserve existing scan results
LCSD N no enable language and character set detection
LCSDDATA LOSSY no define the scope of the detection
HELP N show help screen (this screen)
QUERY N select clause to scan subset of tables or columns
---------- ------- ------ -------------------------------------------------
Scanner terminated successfully.
3.执行csscan获得字符集报告
当前数据库的字符集为ZHS16GBK:
NAME VALUE$
---------------------------------------- ------------------------------
NLS_CHARACTERSET ZHS16GBK
假设要转换成UTF8,执行如下扫描:
Character Set Scanner v2.1 : Release 10.2.0.3.0 - Production on Thu Jul 5 10:08:56 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options
Enumerating tables to scan...
. process 1 scanning SYS.ATTRIBUTE$[AAAAC1AABAAAAURAAA]
. process 3 scanning SYS.PARAMETER$[AAAAC1AABAAAAURAAA]
. process 2 scanning SYS.METHOD$[AAAAC1AABAAAAURAAA]
. process 2 scanning SYS.TYPE$[AAAAC1AABAAAAURAAA]
. process 3 scanning SYS.SOURCE$[AAAABIAABAAAAIRAAA]
. process 1 scanning SYS.LIBRARY$[AAAAACAABAAAAAZAAA]
......(省略若干行)
. process 1 scanning GLOBAL.AW$GLOBAL[AAAKdXAAFAAAACBAAA]
. process 2 scanning GLOBAL.AW$GLOBAL[AAAKdaAAGAAAABRAAA]
. process 3 scanning GLOBAL.AW$GLOBAL[AAAKdZAAGAAAABJAAA]
. process 1 scanning GLOBAL.AW$GLOBAL[AAAKdeAAGAAAABxAAA]
. process 2 scanning GLOBAL.AW$GLOBAL[AAAKddAAGAAAABpAAA]
Creating Database Scan Summary Report...
Creating Individual Exception Report...
Scanner terminated successfully.
4.查看扫描报告
默认情况下,csscan会在当前系统目录下,比如本例是E盘根目录下,生成一个叫san.out的日志文件,一个叫scan.err的错误记录文件和一个
叫scan.txt的报告。
本例的报告如下:
Time Started : 2007-07-05 10:08:58
Time Completed: 2007-07-05 10:11:05
Process ID Time Started Time Completed
---------- -------------------- --------------------
1 2007-07-05 10:09:17 2007-07-05 10:11:04
2 2007-07-05 10:09:17 2007-07-05 10:11:04
3 2007-07-05 10:09:17 2007-07-05 10:11:04
---------- -------------------- --------------------
[Database Size]
Tablespace Used Free Total Expansion
------------------- --------- --------------- --------------- -----------
SYSTEM 545.38M 4.63M 550.00M .00K
UNDOTBS1 95.63M 104.38M 200.00M .00K
SYSAUX 109.25M 10.75M 120.00M .00K
TEMP .00K .00K .00K .00K
USERS 64.00K 4.94M 5.00M .00K
GLOBAL_DATA 30.25M 3.25M 33.50M .00K
GLOBAL_TEMP .00K .00K .00K .00K
------------------- --------- --------------- --------------- -----------
Total 780.56M 127.94M 908.50M .00K
The size of the largest CLOB is 351364 bytes
[Database Scan Parameters]
Parameter Value
------------------------------ ----------------------------------------
CSSCAN Version v2.1
Instance Name ning
Database Version 10.2.0.3.0
Scan type Full database
Scan CHAR data? YES
Database character set ZHS16GBK
FROMCHAR ZHS16GBK
TOCHAR utf8
Scan NCHAR data? NO
Array fetch buffer size 1024000
Number of processes 3
Capture convertible data? NO
------------------------------ ----------------------------------------
[Scan Summary]
Some character type data in the data dictionary are not convertible to the new character set
All character type application data are convertible to the new character set
[Data Dictionary Conversion Summary]
Datatype Changeless Convertible Truncation Lossy
--------------------- ----------- ------------ ----------- --------------
VARCHAR2 1,464,108 0 0 0
CHAR 0 0 0 0
LONG 110,255 0 0 0
CLOB 7,022 0 0 28
VARRAY 25 0 0 0
--------------------- ----------- ------------ ----------- --------------
Total 1,581,410 0 0 28
Total in percentage 99.998% 0.000% 0.000% 0.002%
The data dictionary can not be safely migrated using the CSALTER script
[Application Data Conversion Summary]
Datatype Changeless Convertible Truncation Lossy
--------------------- ----------- ------------ ----------- --------------
VARCHAR2 6,866 210 0 0
CHAR 0 0 0 0
LONG 0 0 0 0
CLOB 0 0 0 0
VARRAY 0 0 0 0
--------------------- ----------- ------------ ----------- --------------
Total 6,866 210 0 0
Total in percentage 97.032% 2.968% 0.000% 0.000%
[Distribution of Convertible, Truncated and Lossy Data by Table]
USER.TABLE Convertible Truncation Lossy
-------------------------------------- ---------------- ---------- -------
GLOBAL.CUSTOMER_DIM 134 0 0
GLOBAL.PRODUCT_CHILD_PARENT 5 0 0
GLOBAL.PRODUCT_DIM 23 0 0
GLOBAL.TIME_DIM 24 0 0
GLOBAL.TIME_MONTH_DIM 24 0 0
SYS.WRH$_SQL_PLAN 0 0 28
-------------------------------------- ---------------- ---------- -------
[Distribution of Convertible, Truncated and Lossy Data by Column]
USER.TABLE|COLUMN Convertible Truncation Lossy
------------------------------------------- ------------ ---------- ------
GLOBAL.CUSTOMER_DIM|ACCOUNT_DSC_DUTCH 2 0 0
GLOBAL.CUSTOMER_DIM|ACCOUNT_DSC_FRENCH 10 0 0
GLOBAL.CUSTOMER_DIM|MARKET_SEGMENT_DSC_FRENCH 1 0 0
GLOBAL.CUSTOMER_DIM|REGION_DSC_DUTCH 8 0 0
GLOBAL.CUSTOMER_DIM|REGION_DSC_FRENCH 34 0 0
GLOBAL.CUSTOMER_DIM|SHIP_TO_DSC_DUTCH 2 0 0
GLOBAL.CUSTOMER_DIM|SHIP_TO_DSC_FRENCH 11 0 0
GLOBAL.CUSTOMER_DIM|TOTAL_MARKET_DSC_FRENCH 61 0 0
GLOBAL.CUSTOMER_DIM|WAREHOUSE_DSC_DUTCH 5 0 0
GLOBAL.PRODUCT_CHILD_PARENT|PRODUCT_DSC_FRENCH 5 0 0
GLOBAL.PRODUCT_DIM|CLASS_DSC_FRENCH 18 0 0
GLOBAL.PRODUCT_DIM|FAMILY_DSC_FRENCH 2 0 0
GLOBAL.PRODUCT_DIM|ITEM_DSC_FRENCH 3 0 0
GLOBAL.TIME_DIM|MONTH_DSC_FRENCH 24 0 0
GLOBAL.TIME_MONTH_DIM|MONTH_DSC_FRENCH 24 0 0
SYS.WRH$_SQL_PLAN|OTHER_XML 0 0 28
------------------------------------------- ------------ ---------- ------
[Indexes to be Rebuilt]
USER.INDEX on USER.TABLE(COLUMN)
-------------------------------------------------------------------------
从上面的报告中,我们可以发现SYS.WRH$_SQL_PLAN中有28行CLOB类型的数据在字符集转换中可能出现数据丢失,详细的情况可以通过查看
scan.err获得:
[Database Scan Parameters]
Parameter Value
------------------------------ ------------------------------------------
CSSCAN Version v2.1
Instance Name ning
Database Version 10.2.0.3.0
Scan type Full database
Scan CHAR data? YES
Database character set ZHS16GBK
FROMCHAR ZHS16GBK
TOCHAR utf8
Scan NCHAR data? NO
Array fetch buffer size 1024000
Number of processes 3
Capture convertible data? NO
------------------------------ ------------------------------------------
[Data Dictionary individual exceptions]
User : SYS
Table : WRH$_SQL_PLAN
Column: OTHER_XML
Type : CLOB
Number of Exceptions : 28
Max Post Conversion Data Size: 97844
ROWID Exception Type Size Cell Data(first 30 bytes)
------------------ ------------------ ----- ------------------------------
AAACMDAADAAAAvQAAP lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADBvAAG lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADC8AAX lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADCDAAC lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADCEAAF lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADCHAAM lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADGkAAD lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADI3AAG lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADJGAAG lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADM+AAk lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADM6AAa lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADMaAAo lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADMyAAq lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADN1AAO lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADN2AAS lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADN9AAA lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADNNAAH lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADNSAAE lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADNVAAL lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADNfAAE lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADNhAAl lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADNjAAU lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADNnAAr lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADNqAAJ lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADNuAAr lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADNvAAQ lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADNzAAW lossy conversion <other_xml><info type="db_vers
AAACMDAADAAADOCAAe lossy conversion <other_xml><info type="db_vers
------------------ ------------------ ----- ------------------------------
5.CSSCAN的版本
10gR1带了csscan2.0版本,10gR2自带了csscan2.1版。由于csscan的新版本不随patchset发布,所以在8i和9i中使用csscan的时候,最好从OTN下载最新的版本:http://technet.oracle.com/software/tech/globalization/content.html
参考文章:
Metalink:Availability & versioning of the Character Set Scanner(Note:179843.1)
Metalink:Use Scanner Utility before Altering the Database Character Set(Note:123670.1)