如何简化Oracle数据库研究:“绅士的脚本”集

下午好 我们是Rostelecom数据管理部门之一的系统分析师团队。 我们公司拥有300多种异构数据源-各种各样的数据源对于支持Rostelecom在所有众多领域的工作都是必不可少的。 我们研究数据源,并在必要时将其部分上传到存储循环。


表面上的好奇心流浪者。 它还具有许多异构数据源。 图片取自therahnuma.com。

在此过程中,区分了两个子任务:定义一种策略,用于根据源表的属性从源表收集数据,并准备作为数据仓库“接收方”的表。 为此,我们使用各种GUI和反向工程工具。 另外,在收集信息时,系统分析师开始获取对DBMS(主要是Oracle)的信息表的辅助查询池。 在本文中,我将分享我们团队使用的此类脚本的“绅士组合”。

首先,简要列出所有脚本:

  • 许多脚本使用xmlagg来聚合字符串,因为listagg无法处理串联导致的太长的字符串。
  • 在除“过程,函数和包”之外的所有脚本中,目标表都是通过“ with”块中的过滤器表指定的。 填写方案名称和表名称。
  • 每个脚本都附带一个或多个用例,规范说明(结果集)以及已用系统表的列表(以评估在特定数据库上使用的可能性)。

脚本“关于表的信息”


规格
栏名
评注
SCHEMA_NAME
数据架构名称(OWNER)
TABLE_NAME
表名
评论
在桌子上评论
身高
表中的行数(大约)
宽度
列数
DATETIME_COLUMNS
具有临时数据类型的列和基于名称的列(大概是时间戳)(模式-%期间%,%日期%,%时间%)
AVG_ROW_LEN
平均字符串长度(以字节为单位)
PART_KEY
分区列
SUBPART_KEY
分区列

使用的系统表: all_tab_columns,all_tab_comments,all_tab_statistics,all_part_key_columns,all_subpart_key_columns。

该查询对于确定从源系统上载数据的策略很有用。 如果主键建立在所考虑的表上,则可以组织卸载并在其上随后分配“增量”。 如果有时间戳记(例如,在技术领域中提供有关插入数据或更新的信息),则可以组织一段时间内仅上载已更改/添加的记录的上载。 创建类似的表“接收器”时,有关分区结构的信息会派上用场。

要求正文:

with filter (owner, table_name) as (    select 'SCHEMA_NAME_1', t.*    from table(        sys.odcivarchar2list(            'TABLE_NAME_1'            , 'TABLE_NAME_2'        )    ) t    union all    select        owner        , table_name    from        all_tables    where owner = 'SCHEMA_NAME_2' ) select    a.owner as schema_name    , a.table_name    , e.comments    , b.height    , c.width    , d.datetime_columns    , b.avg_row_len    , p.part_key    , s.subpart_key from    filter a    left join (        select            owner            , table_name            , num_rows as height            , avg_row_len        from all_tab_statistics        where object_type = 'TABLE'    ) b        on            a.table_name = b.table_name            and a.owner = b.owner    left join (        select            owner            , table_name            , count(1) as width        from all_tab_columns        group by            owner            , table_name    ) c        on            a.table_name = c.table_name            and a.owner = c.owner    left join (        select            owner            , table_name            , listagg(                column_name || ' (' || data_type || ')'                , ', '            ) within group (order by column_id) as datetime_columns        from all_tab_columns        where            data_type = 'DATE'            or data_type like 'TIMESTAMP%'            or data_type like 'INTERVAL%'            or lower(column_name) like '%period%'            or lower(column_name) like '%date%'            or lower(column_name) like '%time%'        group by            owner            , table_name    ) d        on            a.table_name = d.table_name            and a.owner = d.owner    left join (        select            owner            , table_name            , comments        from all_tab_comments        where table_type = 'TABLE'    ) e        on            a.table_name = e.table_name            and a.owner = e.owner    left join (        select            owner            , name as table_name            , listagg(                column_name                , ', '            ) within group (order by column_position) as part_key        from all_part_key_columns        where object_type = 'TABLE'        group by            owner            , name    ) p        on            a.owner = p.owner            and a.table_name = p.table_name    left join (        select            owner            , name as table_name            , listagg(                column_name                , ', '            ) within group (order by column_position) as subpart_key        from all_subpart_key_columns        where object_type = 'TABLE'        group by            owner            , name    ) s        on            a.owner = s.owner            and a.table_name = s.table_name order by    e.owner    , e.table_name ; 

脚本“分区和子分区”


规格:

栏名
评注
SCHEMA_NAME
数据架构名称(OWNER)
TABLE_NAME
表名
PART_KEY
分区列
PARTITION_NAME
分区名称
PARTITION_POSITION
分区号
PARTITION_HEIGHT
分区中的行数
SUBPART_KEY
分区列
SUBPARTITION_NAME
缔约方名称
SUBPARTITION_POSITION
子分区号
SUBPARTITION_HEIGHT
子方中的行数

使用的系统表: all_tab_partitions,all_tab_subpartitions,all_part_key_columns,all_subpart_key_columns。

当分区直接用作数据源时,该脚本对于获取分区的特征(名称,大小)很有用。

要求正文:

 with filter (owner, table_name) as (   select 'SCHEMA_NAME_1', t.*   from table(       sys.odcivarchar2list(           'TABLE_NAME_1'           , 'TABLE_NAME_2'       )   ) t   union all   select       owner       , table_name   from       all_tables   where owner = 'SCHEMA_NAME_2' ) select   f.owner as schema_name   , f.table_name   , p.part_key   , pc.partition_name   , pc.partition_position   , pc.num_rows as partition_height   , s.subpart_key   , sc.subpartition_name   , sc.subpartition_position   , sc.num_rows as subpartition_height from   filter f   join (       select           owner           , name as table_name           , listagg(               column_name               , ', '           ) within group (order by column_position) as part_key       from all_part_key_columns       where object_type = 'TABLE'       group by           owner           , name   ) p       on           f.owner = p.owner           and f.table_name = p.table_name   left join all_tab_partitions pc       on           p.table_name = pc.table_name           and p.owner = pc.table_owner   left join (       select           owner           , name as table_name           , listagg(               column_name               , ', '           ) within group (order by column_position) as subpart_key       from all_subpart_key_columns       where object_type = 'TABLE'       group by           owner           , name   ) s       on           p.owner = s.owner           and p.table_name = s.table_name   left join all_tab_subpartitions sc       on           f.owner = sc.table_owner           and f.table_name = sc.table_name           and pc.partition_name = sc.partition_name   order by       f.owner       , f.table_name ; 

脚本“表的属性组成”


规格:

栏名
评注
SCHEMA_NAME
数据架构名称(OWNER)
TABLE_NAME
表名
COLUMN_ID
列序号
COLUMN_NAME
栏名
DATA_TYPE
资料类型
COLUMN_COMMENT
专栏评论
标度
比例尺(数据类型为NUMBER)
精密度
准确性(数据类型为NUMBER)
BYTE_LENGTH
字段长度(以字节为单位)
编码
编码文本数据类型(CHAR,VARCHAR2,NCHAR,NVARCHAR2)
CHAR_LENGTH
文本数据类型(CHAR,VARCHAR2,NCHAR,NVARCHAR2)的最大字符串长度
NOT_NULL
标记“必填”
IS_PRIMARY
标记“包含在主键中”
DEFAULT_VALUE
预设值
COLUMN_IMPACT
其他表中引用列的列的列表。
COLUMN_DEPEND
该列引用的其他表中的列列表。

使用的系统表: all_tables,all_constraints,all_cons_columns,all_tab_columns,all_col_comments,v $ nls_parameters。

当需要有关表,表与其他表的关系以及完整的属性组成的详细信息时,该脚本对于在数据仓库中准备“接收器”表很有用。 filter2表用于过滤要搜索链接(从和到)的表。 默认情况下,表是从除系统方案以外的所有方案中获取的。

要求正文:

 with filter (owner, table_name) as (   select 'SCHEMA_NAME_1', t.*   from table(       sys.odcivarchar2list(           'TABLE_NAME_1'           , 'TABLE_NAME_2'       )   ) t   union all   select       owner       , table_name   from       all_tables   where owner = 'SCHEMA_NAME_2' ) , filter2 (owner, table_name) as (   select owner, table_name   from all_tables   where owner not in ('MDSYS', 'CTXSYS', 'SYSTEM', 'XDB', 'SYS') ) , refs as (   select       b.constraint_type as from_constraint_type       , b.constraint_name as from_constraint_name       , d.position as from_position       , d.column_name as from_column_name       , b.table_name as from_table_name       , b.owner as from_owner       , a.owner as to_owner       , a.table_name as to_table_name       , c.column_name as to_column_name       , c.position as to_position       , a.constraint_name as to_constraint_name       , a.constraint_type as to_constraint_type   from       all_constraints a       left join all_constraints b           on               a.r_constraint_name = b.constraint_name               and a.r_owner = b.owner       left join all_cons_columns c           on               a.constraint_name = c.constraint_name               and a.table_name = c.table_name               and a.owner = c.owner       left join all_cons_columns d           on               b.constraint_name = d.constraint_name               and b.table_name = d.table_name               and b.owner = d.owner       where           a.constraint_type = 'R'           and b.constraint_type in ('P', 'U')           and c.position = d.position ) , depends as (   select       rtrim(           xmlagg(               xmlelement(                   e                   , to_owner || '.' || to_table_name || '.' || to_column_name                   , ', '               ).extract('//text()')               order by to_owner           ).getclobval()           , ', '       ) as val       , from_owner as owner       , from_table_name as table_name       , from_column_name as column_name   from refs   where (to_owner, to_table_name) in (select * from filter2)   group by       from_table_name       , from_column_name       , from_owner ) , impacts as (   select       rtrim(           xmlagg(               xmlelement(                   e                   , from_owner || '.' || from_table_name || '.' || from_column_name                   , ', '               ).extract('//text()')               order by from_owner           ).getclobval()           , ', '       ) as val       , to_owner as owner       , to_table_name as table_name       , to_column_name as column_name   from refs   where (from_owner, from_table_name) in (select * from filter2)   group by       to_table_name       , to_column_name       , to_owner ) select   f.owner as schema_name   , f.table_name   , a.column_id   , a.column_name   , a.data_type   , b.comments as column_comment   /*         precision  ,    38       (      )       ,    scale,    0 (  ). */   , decode (       a.data_type       , 'NUMBER', nvl(a.data_scale, 0)       , ''   ) as scale   , decode (       a.data_type       , 'NUMBER', nvl(a.data_precision, 38)       , ''   ) as precision   /*             CHAR, VARCHAR2           DDL-   ,    NCHAR or NVARCHAR2        .*/   , a.data_length as byte_length   , case       when a.data_type in ('CHAR', 'VARCHAR2', 'NCHAR', 'NVARCHAR2')       then d.value   end as encoding   , case       when a.data_type in ('CHAR', 'VARCHAR2', 'NCHAR', 'NVARCHAR2')       then a.char_length --a.char_col_decl_length   end as char_length   , decode(a.nullable, 'Y', 'N', 'Y') as not_null   , decode(c.is_primary, 1, 'Y', 'N') as is_primary   , a.data_default as default_value   , impacts.val as column_impact   , depends.val as column_depend from   filter f   left join all_tab_columns a       on           f.owner = a.owner           and f.table_name = a.table_name   left join all_col_comments b       on           a.owner = b.owner           and a.table_name = b.table_name           and a.column_name = b.column_name   left join (       select           1 as is_primary           , owner           , table_name           , column_name       from all_cons_columns       where (owner, constraint_name) in (           select owner, constraint_name           from all_constraints           where constraint_type = 'P'       )   ) c       on           a.owner = c.owner           and a.table_name = c.table_name           and a.column_name = c.column_name   left join v$nls_parameters d       on decode (           a.character_set_name           , 'CHAR_CS', 'NLS_CHARACTERSET'           , 'NCHAR_CS', 'NLS_NCHAR_CHARACTERSET'           , a.character_set_name       ) = d.parameter   left join depends       on           a.owner = depends.owner           and a.table_name = depends.table_name           and a.column_name = depends.column_name   left join impacts       on           a.owner = impacts.owner           and a.table_name = impacts.table_name           and a.column_name = impacts.column_name order by   f.owner   , f.table_name   , a.column_id ; 

脚本“程序,函数和包”


规格:
栏名
评注
SCHEMA_NAME
数据架构名称(OWNER)
姓名
过程/功能/包/包头的名称
身材
身材
型式
类型(包装体,包装,功能,程序)
包好
标记“是否已编码主体(已包装)”

二手系统表: all_source

在分析源时,可能会出现一项任务,以研究系统中循环的数据流。 没有软件包,功能和过程的代码库,几乎是不可能的,尤其是在文档不完整或缺少的情况下。 为方便起见,通过脚本列出的对象可以表的形式表示。 可以使用控制台实用程序将请求的结果输出到流中,并通过简单的处理程序(bash脚本)将其重定向到文件,以供您喜欢的编辑器进一步研究。 此外,各种处理程序可以“挂”在输出流上-美化,展开等。

要求正文:

 select   t.owner as schema_name   , t.name as name   , xmlagg(       xmlelement(           e           , t.text           , ''       ).extract('//text()')       order by t.line asc   ).getclobval() as body   , f.wrapped   , t.type as type from (   select       owner, name, type       , case           when lower(text) like '%wrapped%' then 1           else 0       end as wrapped   from all_source   where type in (       'PACKAGE BODY'       , 'PACKAGE'       , 'FUNCTION'       , 'PROCEDURE'   )   and line = 1   and owner not in ('MDSYS', 'CTXSYS', 'SYSTEM', 'XDB', 'SYS') ) f join all_source t on   f.owner = t.owner   and f.name = t.name   and f.type = t.type group by   t.owner   , t.name   , t.type   , f.wrapped order by   t.owner   , t.name   , t.type ; 

结论


上面描述的脚本帮助我们的系统分析员摆脱了收集数据库信息的许多常规任务,并专注于更具创造性的事物,例如加载策略和“接收器”表的结构。 我希望这些脚本也对您有用。 知道如何自动化这些任务和类似任务将很有趣。

Source: https://habr.com/ru/post/zh-CN442082/


All Articles