pg4n package

Submodules

pg4n.cmp_domain_checker module

class pg4n.cmp_domain_checker.CmpContext(expression, a, b)[source]

Bases: object

a: Column
b: Column
expression: Predicate
class pg4n.cmp_domain_checker.CmpDomainChecker(parsed_sql, columns)[source]

Bases: object

check()[source]

Does analysis for suspicous comparisons between different domains. e.g., comparing columns off type VARCHAR(20) and VARCHAR(50) Returns a warning message if something was found, otherwise None.

Return type

Optional[str]

pg4n.config_parser module

class pg4n.config_parser.ConfigParser(file)[source]

Bases: object

parse()[source]

Reads config values from file givein in __init__.

Return type

Optional[ConfigValues]

pg4n.config_reader module

class pg4n.config_reader.ConfigReader[source]

Bases: object

read()[source]

The configuration files are read in order from: /etc/pg4n.conf then from $XDG_CONFIG_HOME/pg4n.conf, or if $XDG_CONFIG_HOME is not set, from $HOME/.config/pg4n.conf, and lastly from $PWD/pg4n.conf, with each new value introduced in latter files overriding the previous value.

Return type

Optional[ConfigValues]

pg4n.config_values module

class pg4n.config_values.ConfigValues[source]

Bases: TypedDict

CmpDomain: bool
EqWildcard: bool
ImpliedExpression: bool
InconsistentExpression: bool
StrangeHaving: bool
SubqueryOrderBy: bool
SubquerySelect: bool
SumDistinct: bool

pg4n.eq_wildcard_checker module

class pg4n.eq_wildcard_checker.EqWildcardChecker(parsed_sql, qep_analysis)[source]

Bases: object

check()[source]

Returns warning message if the sql has equals operation to a string with wild card character (the ‘%’ character), otherwise None.

Return type

Optional[str]

pg4n.errfmt module

class pg4n.errfmt.ErrorFormatter(warning_msg, warning_name, underlined_query=None)[source]

Bases: object

format()[source]

Returns a formatted error message.

Return type

str

pg4n.implied_expression_checker module

class pg4n.implied_expression_checker.ImpliedExpressionChecker(parsed_sql, sql_statement, db_connection)[source]

Bases: object

check()[source]

Returns warning_msg if implied expression is detected, otherwise None.

Return type

Optional[str]

pg4n.inconsistent_expression_checker module

class pg4n.inconsistent_expression_checker.InconsistentExpressionChecker(parsed_sql, qep_analysis)[source]

Bases: object

Inconsistent expression is some expression that is never true. For example: x = 10 AND x = 20

This checker only finds a small subset of such expression, where postgresql itself detects the inconsistent expression in its query optimizer and exposes that information via its query execution plan.

check()[source]

Returns warning msg if the sql contains inconsistent expression, otherwise None.

Return type

Optional[str]

pg4n.main module

pg4n.main.main()[source]

Initiate session by getting psql connection parameters via psql child process, initializing semantic analysis and wrapper modules, and then starting the session.

Return type

None

pg4n.psqlconninfo module

class pg4n.psqlconninfo.PsqlConnInfo(psql_args)[source]

Bases: object

Get PostgreSQL server address, port, database name, and user via psql by supplying same arguments as the original psql process. This way we can avoid writing a command-line argument parser.

get()[source]

Get 5-tuple that has the PostgreSQL host, port, user, pass, and db name.

Returns

Optional[Tuple[str, str, str, str, str]] – (hostname, port, user, password, database name) 5-tuple or None.

tok_database(name=None): ParserElement = W:(0-9A-Z_a-zªµ·...)
tok_end(name=None): ParserElement = '".'
tok_host(name=None): ParserElement = W:(.-9A-Z_a-zªµ·...)
tok_port(name=None): ParserElement = W:(0-9)
tok_pre_database(name=None): ParserElement = 'You are connected to database "'
tok_pre_host(name=None): ParserElement = {'" on host "' | '" via socket in "'}
tok_pre_port(name=None): ParserElement = '" at port "'
tok_pre_user(name=None): ParserElement = '" as user "'
tok_user(name=None): ParserElement = W:(0-9A-Z_a-zªµ·...)

pg4n.psqlparser module

Parse psql output.

class pg4n.psqlparser.PsqlParser[source]

Bases: object

Parses psql output for syntactic analysis.

debug: bool = False
multiline_prompt_ends: list[str] = ['-#', '*#', "'#", '"#', '$#', '(#', '->', '*>', "'>", '">', '$>', '(>', '-*#', '**#', "'*#", '"*#', '$*#', '(*#', '-*>', '**>', "'*>", '"*>', '$*>', '(*>', '-!#', '*!#', "'!#", '"!#', '$!#', '(!#', '-!>', '*!>', "'!>", '"!>', '$!>', '(!>', '-?#', '*?#', "'?#", '"?#', '$?#', '(?#', '-?>', '*?>', "'?>", '"?>', '$?>', '(?>']
output_has_magical_return(psql)[source]

Check for weird Return presses.

Parameters

psql (str) – Raw console output that includes terminal control codes.

Returns

bool – if output has a weird Return press.

output_has_new_prompt(psql)[source]

Detect when psql query evaluation has ended by parsing for a new prompt.

Parameters

psql (str) – Raw console output that includes terminal control codes.

Returns

bool – if output is a fresh prompt.

parse_last_stmt(psql)[source]

Parse for last SQL query statement in a string.

Parameters

psql (str) – screenscraped psql string with only whitespace after most recent query.

Returns

str – parsed SQL query as plain string.

parse_new_prompt_and_rest(psql)[source]

Parse for a fresh prompt and everything preceding it into 2-length list, facilitating easy message injection.

Parameters

psql (str) – Raw console output that includes terminal control codes.

Returns

list[str] – a two-part list with everything before the prompt line and then the prompt, to allow easy message injection.

List is empty if no fresh prompt was found.

parse_psql_version(psql)[source]

Parse for psql version and return version number.

Parameters

psql (str) – psql –version output

Returns

str – version string (e.g “14.5”)

parse_syntax_error(psql)[source]

Parse for syntax error output.

Parameters

psql (str) – screen-scraped psql output.

Returns

str – syntax error message from ‘ERROR:’ to last ‘^’.

prompt_chars: str = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyzªµ·ºÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ'
stmt_chars: str = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
stmt_end: str = ';'
tok_rev_prompt_end(name=None): ParserElement = Combine:({{[<SP><TAB><CR><LF>] {'#' | '>'}} [{'*' | '!'} | '?']} {'=' | '^'})

pg4n.psqlwrapper module

Interface with psql and capture all input and output.

Uses pexpect in combination with pyte for interfacing and screen-scraping respectively.

class pg4n.psqlwrapper.PsqlWrapper(psql_args, hook_semantic_f, hook_syntax_f, parser)[source]

Bases: object

Handles terminal interfacing with psql, using the parameter parser to pick up relevant SQL statements and syntax errors for hook functions.

debug: bool = False
start()[source]

Start psql process and feed hook functions with intercepted queries and syntax errors.

Control is only returned after psql process exits.

Return type

None

supported_psql_versions: list[str] = ['14.5']

pg4n.qepparser module

class pg4n.qepparser.QEPAnalysis(qep_)[source]

Bases: object

Represents the result of EXPLAIN ANALYZE.

property plan: Plan

A dict of the root node’s properties.

Return type

Plan

property qep: QEP

A dict of the query execution plan’s properties.

Return type

QEP

property root: QEPNode

The root node of the query execution plan.

Return type

QEPNode

class pg4n.qepparser.QEPNode(node_)[source]

Bases: object

A node in a query execution plan.

find(pr, recursive=False)[source]

Finds nodes matching the predicate.

Parameters
  • pr (Callable[[Plan], bool]) – a function that takes a node and returns True if it matches

  • recursive – if True, search recursively, otherwise only search this+children

Returns

list[Plan] – a list of matching nodes

findval(key, val, recursive=False)[source]

Finds nodes with the given key and value.

Parameters
  • key (str) – the key to search for

  • val (object) – the value to search for

  • recursive – if True, search recursively, otherwise only search this+children

Returns

list[Plan] – a list of matching nodes

property plan: Plan

A dict of the node’s properties.

Return type

Plan

property plans: list[pg4n.qepparser.Plan]

A list of the node’s children.

Return type

list[Plan]

rfind(pred)[source]

Finds nodes matching the predicate, recursively.

Parameters

pred (Callable[[Plan], bool]) – a function that takes a node and returns True if it matches

Returns

list[Plan] – a list of matching nodes

rfindval(key, val)[source]

Finds nodes with the given key and value, recursively.

Parameters
  • key (str) – the key to search for

  • val (object) – the value to search for

Returns

list[Plan] – a list of matching nodes

class pg4n.qepparser.QEPParser(*args, conn=None, constraint_exclusion=True, **kwargs)[source]

Bases: object

Performs analyses on given queries, returning resultant QEPAnalysis.

parse(stmt, *args, **kwargs)[source]

Alias for __call__

Return type

QEPAnalysis

pg4n.qepparser.node

alias of Plan

pg4n.qepparser.qep

alias of QEP

pg4n.semanticrouter module

Handle semantic analysis modules.

class pg4n.semanticrouter.SemanticRouter(pg_host, pg_port, pg_user, pg_pass, pg_name, config_values)[source]

Bases: object

Analyze given SQL queries via a plethora of analysis modules.

run_analysis(sql_query)[source]

Run analysis modules on SQL query string and get an insightful message in return.

Semantic router (some day) implements basic heuristics to avoid running all the modules on all queries. For now, it is dumb brute force router. :type sql_query: str :param sql_query: is a single well-formed query to run analytics on. :return: str – an insightful message that might include vt100-compatible control codes and newlines (without carriage returns).

pg4n.sqlparser module

class pg4n.sqlparser.Column(name, type, table)[source]

Bases: object

name: str
table: str
type: PostgreSQLDataType
class pg4n.sqlparser.PostgreSQLDataType(name, digits, precision)[source]

Bases: object

digits: Optional[int]
name: str
precision: Optional[int]
class pg4n.sqlparser.SqlParser(db_connection)[source]

Bases: object

static find_all_table_names(parsed_sql)[source]

Finds all unique table names in ‘parsed_sql’. Any table name with an alias is listed without the alias (e.g.: FROM <table> AS <alias>) Also removes duplicate table names from the result.

Return type

list[str]

static find_where_predicates(root)[source]

Finds all Predicate nodes inside Where statements from some root node (usually the whole parsed output from sqlparser.parse_one()). This function takes care to not introduce any duplicate Predicate’s in case of nested Where statements.

Return type

list[Predicate]

static get_column_name_from_column_expression(column_expression)[source]

Returns the column name of column expression ast node.

Return type

str

get_query_columns(parsed_sql)[source]

Gets all columns from all tables mentioned in parsed_sql.

Return type

list[Column]

static get_root_node(node)[source]

Finds the root node (probably exp.Select) from node.

Return type

Expression

parse(sql)[source]

Parses all the statements in ‘sql’.

‘sql’ should be a string of one or more postgresql statements (delimited by ‘;’). The last ‘;’ in ‘sql’ is optional. Raises whichever Exception sqlglot wants to throw on invalid sql.

Return type

list[Expression]

parse_one(sql)[source]

Parses the first statement in ‘sql’. Does not validate that ‘sql’ contains only 1 statement! ‘sql’ should be a postgresql statement. The trailing ‘;’ in ‘sql’ is optional. Raises whichever Exception sqlglot wants to throw on invalid sql.

Return type

Expression

pg4n.strange_having_checker module

class pg4n.strange_having_checker.StrangeHavingChecker(parsed_sql, qep_analysis)[source]

Bases: object

check()[source]

Returns warning message if there exists HAVING without a GROUP BY, otherwise None.

Return type

Optional[str]

pg4n.subquery_order_by_checker module

class pg4n.subquery_order_by_checker.SubqueryOrderByChecker(parsed_sql, qep_analysis)[source]

Bases: object

check()[source]

Returns warning message if there exists ORDER BY in a subquery, otherwise None.

This check gives misses some situations with redundant ORDER BY but should never give false positives, only false negatives.

Return type

Optional[str]

pg4n.subquery_select_checker module

class pg4n.subquery_select_checker.SubquerySelectChecker(parsed_sql, sql_parser)[source]

Bases: object

check()[source]

Returns warning message if there no column SELECTed in a subquery is not used in that subquery of its own columns, otherwise returns None.

Return type

Optional[str]

class pg4n.subquery_select_checker.SubquerySelectContext(subquery)[source]

Bases: object

subquery: Expression

pg4n.sum_distinct_checker module

class pg4n.sum_distinct_checker.SumDistinctChecker(parsed_sql, qep_analysis)[source]

Bases: object

check()[source]

Returns warning message if the sql has SUM/AVG(DISTINCT …), otherwise None

Return type

Optional[str]

pg4n.util module

Module contents