Uma maneira de obter o histórico de bloqueios no PostgreSQL

Continuação do artigo " Tentando criar um análogo de ASH para PostgreSQL ".

O artigo será considerado e mostrado em consultas e exemplos específicos - que informações úteis podem ser obtidas usando o histórico da visualização pg_locks.
Advertência
Devido à novidade do tópico e ao período de teste incompleto, o artigo pode conter erros. Críticas e comentários são fortemente incentivados e esperados.

Dados de entrada


Histórico de envios de Pg_locks


archive_locking
CREATE TABLE archive_locking ( timepoint timestamp without time zone , locktype text , relation oid , mode text , tid xid , vtid text , pid integer , blocking_pids integer[] , granted boolean , queryid bigint ); 

De fato, a tabela é semelhante à tabela archive_pg_stat_activity descrita em mais detalhes aqui - pg_stat_statements + pg_stat_activity + loq_query = pg_ash? e aqui - Uma tentativa de criar um análogo do ASH para o PostgreSQL.

Para preencher a coluna queryid, use a função

update_history_locking_by_queryid
 --update_history_locking_by_queryid.sql CREATE OR REPLACE FUNCTION update_history_locking_by_queryid() RETURNS boolean AS $$ DECLARE result boolean ; current_minute double precision ; start_minute integer ; finish_minute integer ; start_period timestamp without time zone ; finish_period timestamp without time zone ; lock_rec record ; endpoint_rec record ; current_hour_diff double precision ; BEGIN RAISE NOTICE '***update_history_locking_by_queryid'; result = TRUE ; current_minute = extract ( minute from now() ); SELECT * FROM endpoint WHERE is_need_monitoring INTO endpoint_rec ; current_hour_diff = endpoint_rec.hour_diff ; IF current_minute < 5 THEN RAISE NOTICE 'Current time is less than 5 minute.'; start_period = date_trunc('hour',now()) + (current_hour_diff * interval '1 hour'); finish_period = start_period - interval '5 minute' ; ELSE finish_minute = extract ( minute from now() ) / 5 ; start_minute = finish_minute - 1 ; start_period = date_trunc('hour',now()) + interval '1 minute'*start_minute*5+(current_hour_diff * interval '1 hour'); finish_period = date_trunc('hour',now()) + interval '1 minute'*finish_minute*5+(current_hour_diff * interval '1 hour') ; END IF ; RAISE NOTICE 'start_period = %', start_period; RAISE NOTICE 'finish_period = %', finish_period; FOR lock_rec IN WITH act_queryid AS ( SELECT pid , timepoint , query_start AS started , MAX(timepoint) OVER (PARTITION BY pid , query_start ) AS finished , queryid FROM activity_hist.history_pg_stat_activity WHERE timepoint BETWEEN start_period and finish_period GROUP BY pid , timepoint , query_start , queryid ), lock_pids AS ( SELECT hl.pid , hl.locktype , hl.mode , hl.timepoint , MIN ( timepoint ) OVER (PARTITION BY pid , locktype ,mode ) as started FROM activity_hist.history_locking hl WHERE hl.timepoint between start_period and finish_period GROUP BY hl.pid , hl.locktype , hl.mode , hl.timepoint ) SELECT lp.pid , lp.locktype , lp.mode , lp.timepoint , aq.queryid FROM lock_pids lp LEFT OUTER JOIN act_queryid aq ON ( lp.pid = aq.pid AND lp.started BETWEEN aq.started AND aq.finished ) WHERE aq.queryid IS NOT NULL GROUP BY lp.pid , lp.locktype , lp.mode , lp.timepoint , aq.queryid LOOP UPDATE activity_hist.history_locking SET queryid = lock_rec.queryid WHERE pid = lock_rec.pid AND locktype = lock_rec.locktype AND mode = lock_rec.mode AND timepoint = lock_rec.timepoint ; END LOOP; RETURN result ; END $$ LANGUAGE plpgsql; 

Explicação: o valor da coluna queryid é atualizado na tabela history_locking e, ao criar uma nova seção para a tabela archive_locking, o valor será salvo em valores históricos.

Impressão


Informações gerais sobre os processos como um todo.

ESPERANDO FECHAMENTOS POR LOCKTYPES


Pedido
 WITH t AS ( SELECT locktype , mode , count(*) as total FROM activity_hist.archive_locking WHERE timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND NOT granted GROUP BY locktype , mode ) SELECT locktype , mode , total * interval '1 second' as duration FROM t ORDER BY 3 DESC 

Exemplo
  |  AGUARDANDO FECHAMENTOS POR LOCKTYPES
 + -------------------- + ---------------------------- - + --------------------
 |  tipo de bloqueio |  modo |  duração
 + -------------------- + ---------------------------- - + --------------------
 |  transactionid |  Sharelock  19:39:26
 |  tupla |  AccessExclusiveLock |  00:03:35
 + -------------------- + ---------------------------- - + --------------------

TOMADAS DE FECHAMENTOS POR LOCKTYPES


Pedido
 WITH t AS ( SELECT locktype , mode , count(*) as total FROM activity_hist.archive_locking WHERE timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND granted GROUP BY locktype , mode ) SELECT locktype , mode , total * interval '1 second' as duration FROM t ORDER BY 3 DESC 

Exemplo
  |  TOMADAS DE FECHAMENTOS POR LOCKTYPES
 + -------------------- + ---------------------------- - + --------------------
 |  tipo de bloqueio |  modo |  duração
 + -------------------- + ---------------------------- - + --------------------
 |  relação |  RowExclusiveLock |  51:11:10
 |  virtualxid |  ExclusiveLock |  48:10:43
 |  transactionid |  ExclusiveLock |  44:24:53
 |  relação |  AccessShareLock |  20:06:13
 |  tupla |  AccessExclusiveLock |  17:58:47
 |  tupla |  ExclusiveLock |  01:40:41
 |  relação |  ShareUpdateExclusiveLock |  00:26:41
 |  objeto |  RowExclusiveLock |  00:00:01
 |  transactionid |  Sharelock  00:00:01
 |  estender |  ExclusiveLock |  00:00:01
 + -------------------- + ---------------------------- - + --------------------

Informações detalhadas sobre consultas específicas de consultas

AGUARDANDO FECHAMENTOS POR LOCKTYPES POR QUERYID


Pedido
 WITH lt AS ( SELECT pid , locktype , mode , timepoint , queryid , blocking_pids , MIN ( timepoint ) OVER (PARTITION BY pid , locktype ,mode ) as started FROM activity_hist.archive_locking WHERE timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND NOT granted AND queryid IS NOT NULL GROUP BY pid , locktype , mode , timepoint , queryid , blocking_pids ) SELECT lt.pid , lt.locktype , lt.mode , lt.started , lt.queryid , lt.blocking_pids , COUNT(*) * interval '1 second' as duration FROM lt GROUP BY lt.pid , lt.locktype , lt.mode , lt.started , lt.queryid , lt.blocking_pids ORDER BY 4 

Exemplo
  |  AGUARDANDO FECHAMENTOS POR LOCKTYPES QUERYID
 + ---------- + ------------------------- + ------------ -------- + ------------------------------ + ---------- ---------- + -------------------- + ------------------ -
 |  pid |  tipo de bloqueio |  modo |  iniciado |  queryid |  blocking_pids |  duração
 + ---------- + ------------------------- + ------------ -------- + ------------------------------ + ---------- ---------- + -------------------- + ------------------ -
 |  11288  transactionid |  Sharelock  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {11092} |  00:03:34
 |  11626  transactionid |  Sharelock  2019-09-17 10: 00: 21.380921 |  389015618226997618 |  {12380} |  00:00:29
 |  11626  transactionid |  Sharelock  2019-09-17 10: 00: 21.380921 |  389015618226997618 |  {11092} |  00:03:25
 |  11626  transactionid |  Sharelock  2019-09-17 10: 00: 21.380921 |  389015618226997618 |  {12213} |  00:01:55
 |  11626  transactionid |  Sharelock  2019-09-17 10: 00: 21.380921 |  389015618226997618 |  {12751} |  00:00:01
 |  11629  transactionid |  Sharelock  2019-09-17 10: 00: 24.331935 |  389015618226997618 |  {11092} |  00:03:22
 |  11629  transactionid |  Sharelock  2019-09-17 10: 00: 24.331935 |  389015618226997618 |  {12007} |  00:00:01
 |  12007  transactionid |  Sharelock  2019-09-17 10: 05: 03.327933 |  389015618226997618 |  {11629} |  00:00:13
 |  12007  transactionid |  Sharelock  2019-09-17 10: 05: 03.327933 |  389015618226997618 |  {11092} |  00:01:10
 |  12007  transactionid |  Sharelock  2019-09-17 10: 05: 03.327933 |  389015618226997618 |  {11288} |  00:00:05
 |  12213  transactionid |  Sharelock  2019-09-17 10: 06: 07.328019 |  389015618226997618 |  {12007} |  00:00:10 

TOMANDO FECHAMENTOS POR LOCKTYPES POR QUERYID


Pedido
 WITH lt AS ( SELECT pid , locktype , mode , timepoint , queryid , blocking_pids , MIN ( timepoint ) OVER (PARTITION BY pid , locktype ,mode ) as started FROM activity_hist.archive_locking WHERE timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND granted AND queryid IS NOT NULL GROUP BY pid , locktype , mode , timepoint , queryid , blocking_pids ) SELECT lt.pid , lt.locktype , lt.mode , lt.started , lt.queryid , lt.blocking_pids , COUNT(*) * interval '1 second' as duration FROM lt GROUP BY lt.pid , lt.locktype , lt.mode , lt.started , lt.queryid , lt.blocking_pids ORDER BY 4 

Exemplo
  |  TOMANDO FECHAMENTOS POR LOCKTYPES POR QUERYID
 + ---------- + ------------------------- + ------------ -------- + ------------------------------ + ---------- ---------- + -------------------- + ------------------ -
 |  pid |  tipo de bloqueio |  modo |  iniciado |  queryid |  blocking_pids |  duração
 + ---------- + ------------------------- + ------------ -------- + ------------------------------ + ---------- ---------- + -------------------- + ------------------ -
 |  11288  relação |  RowExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {11092} |  00:03:34
 |  11092  transactionid |  ExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {}  00:03:34
 |  11288  relação |  RowExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {}  00:00:10
 |  11092  relação |  RowExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {}  00:03:34
 |  11092  virtualxid |  ExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {}  00:03:34
 |  11288  virtualxid |  ExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {11092} |  00:03:34
 |  11288  transactionid |  ExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {11092} |  00:03:34
 |  11288  tupla |  AccessExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {11092} |  00:03:34 

Usando o histórico de bloqueios ao analisar incidentes de desempenho.

  1. Uma solicitação com queryid = 389015618226997618 executada por um processo com pid = 11288 bloqueio esperado de 17/09/2019 às 10:00:00 por 3 minutos.
  2. O bloqueio foi realizado por um processo com pid = 11092
  3. Um processo com pid = 11092 executando uma consulta com queryid = 389015618226997618 a partir de 17/09/2019 às 10:00:00 reteve o bloqueio por 3 minutos.

Sumário


Agora, espero, comece o mais interessante e útil - coletar estatísticas e analisar casos sobre o histórico de expectativas e bloqueios.

No futuro, quero acreditar, será um conjunto de algum tipo de nota (semelhante ao Oracle Metalink).

Em geral, é por esse motivo que a técnica utilizada é exibida o mais rápido possível para a familiarização geral.

Em um futuro próximo, tentarei colocar o projeto no github.

Source: https://habr.com/ru/post/pt467719/


All Articles