API

This part of the documentation covers basic classes of the driver: Client, Connection and others.

Client

class clickhouse_driver.Client(*args, **kwargs)

Client for communication with the ClickHouse server. A single connection is established per connected client instance.

Parameters:
  • settings – Dictionary of settings that are passed to every query (except for the client settings, see below). Defaults to None (no additional settings). See all available settings in the ClickHouse docs.
  • **kwargs – All other args are passed to the Connection constructor.

The following keys when passed in settings are used for configuring the client itself:

  • insert_block_size – chunk size to split rows for INSERT. Defaults to 1048576.
  • strings_as_bytes – turns off string column encoding/decoding.
  • strings_encoding – specifies string encoding. UTF-8 by default.
  • use_numpy – use NumPy for reading columns. New in version 0.2.0.
  • opentelemetry_traceparent – OpenTelemetry traceparent header as described by the W3C Trace Context recommendation. New in version 0.2.2.
  • opentelemetry_tracestate – OpenTelemetry tracestate header as described by the W3C Trace Context recommendation. New in version 0.2.2.
  • quota_key – a string to differentiate quotas when the user has keyed quotas configured on the server. New in version 0.2.3.
  • input_format_null_as_default – initialize null fields with default values if the data type of the field is not nullable. Does not work for NumPy. Defaults to False. New in version 0.2.4.
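The split between client-level keys and ordinary query settings can be illustrated with a small sketch. The helper `split_settings`, the `CLIENT_KEYS` constant and the sample values are hypothetical and not part of the driver; `max_block_size` is used only as an example of a server-side setting. The driver does its own separation internally, so this only shows which keys fall into which group.

```python
# Keys that the list above describes as configuring the client itself.
CLIENT_KEYS = frozenset({
    "insert_block_size",
    "strings_as_bytes",
    "strings_encoding",
    "use_numpy",
    "opentelemetry_traceparent",
    "opentelemetry_tracestate",
    "quota_key",
    "input_format_null_as_default",
})


def split_settings(settings):
    """Return (client_level, query_level) views of one settings dict.

    Hypothetical helper for illustration; not part of clickhouse_driver.
    """
    client_level = {k: v for k, v in settings.items() if k in CLIENT_KEYS}
    query_level = {k: v for k, v in settings.items() if k not in CLIENT_KEYS}
    return client_level, query_level


settings = {
    "insert_block_size": 100000,  # client-level: rows per INSERT chunk
    "strings_encoding": "utf-8",  # client-level: string codec
    "max_block_size": 50000,      # example server setting, sent with queries
}
client_level, query_level = split_settings(settings)

# With the driver installed, the same dict is passed in one piece:
#     from clickhouse_driver import Client
#     client = Client("localhost", settings=settings)
```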

disconnect()

Disconnects from the server.
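One way to make sure disconnect() is always called is to wrap the client in a context manager. This is a minimal sketch: the name `connected` is hypothetical, and it assumes only that the object passed in has a disconnect() method as documented above.

```python
from contextlib import contextmanager


@contextmanager
def connected(client):
    """Yield the client and always call disconnect() on exit."""
    try:
        yield client
    finally:
        # Runs on normal exit and when the body raises.
        client.disconnect()


# Usage with the driver installed (not executed here):
#     from clickhouse_driver import Client
#     with connected(Client("localhost")) as client:
#         client.execute("SELECT 1")
```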

execute(query, params=None, with_column_types=False, external_tables=None, query_id=None, settings=None, types_check=False, columnar=False)

Executes query.

Establishes a new connection if one has not been established yet. After query execution the connection remains intact for subsequent queries. If the connection cannot be reused, it is closed and a new connection is created.

Parameters:
  • query – query that will be sent to the server.
  • params – substitution parameters for SELECT queries and data for INSERT queries. Data for INSERT can be a list, tuple or GeneratorType. Defaults to None (no parameters or data).
  • with_column_types – if True, column names and types are returned alongside the result. Defaults to False.
  • external_tables – external tables to send. Defaults to None (no external tables).
  • query_id – the query identifier. If no query id is specified, the ClickHouse server generates one.
  • settings – dictionary of query settings. Defaults to None (no additional settings).
  • types_check – enables type checking of data for INSERT queries. Causes additional overhead. Defaults to False.
  • columnar – if True, the result of the SELECT query is returned in column-oriented form. It also allows data to be inserted in columnar form. Defaults to False (row-like form).

Returns:

  • number of inserted rows for INSERT queries with data. Returning the row count from INSERT FROM SELECT is not supported.
  • if with_column_types=False: list of tuples with rows/columns.
  • if with_column_types=True: tuple of 2 elements:
    • The first element is a list of tuples with rows/columns.
    • The second element is information about columns: names and types.
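The two return shapes can be sketched as follows. The helper names and the table `example_table` are hypothetical, and the sketch assumes that each column entry is a (name, type) pair. The `client` argument is expected to be a Client instance; it is passed in so the functions can be exercised without a running server.

```python
def fetch_dicts(client, limit):
    """Run a parameterized SELECT and return rows as dicts keyed by column name."""
    rows, columns = client.execute(
        "SELECT number, toString(number) AS text "
        "FROM system.numbers LIMIT %(limit)s",
        {"limit": limit},
        with_column_types=True,
    )
    # Assumption: each entry in `columns` is a (name, type) pair.
    names = [name for name, _type in columns]
    return [dict(zip(names, row)) for row in rows]


def insert_rows(client, rows):
    """Insert tuples into a hypothetical table and return the inserted row count."""
    return client.execute("INSERT INTO example_table (id, text) VALUES", rows)
```

With a real client, `fetch_dicts(client, 3)` would issue one SELECT, and `insert_rows(client, [(1, 'a'), (2, 'b')])` would send the rows as INSERT data.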

execute_iter(query, params=None, with_column_types=False, external_tables=None, query_id=None, settings=None, types_check=False, chunk_size=1)

New in version 0.0.14.

Executes a SELECT query with results streaming. See Streaming results.

Parameters:
  • query – query that will be sent to the server.
  • params – substitution parameters for SELECT queries and data for INSERT queries. Data for INSERT can be a list, tuple or GeneratorType. Defaults to None (no parameters or data).
  • with_column_types – if True, column names and types are returned alongside the result. Defaults to False.
  • external_tables – external tables to send. Defaults to None (no external tables).
  • query_id – the query identifier. If no query id is specified, the ClickHouse server generates one.
  • settings – dictionary of query settings. Defaults to None (no additional settings).
  • types_check – enables type checking of data for INSERT queries. Causes additional overhead. Defaults to False.
  • chunk_size – number of rows per chunk of streamed results. Defaults to 1.

Returns:

IterQueryResult proxy.
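A minimal sketch of consuming a streamed result row by row, so the full result set is never held in memory. The helper name `sum_first_column` is hypothetical; it relies on the default chunk_size=1, where each iteration yields a single row.

```python
def sum_first_column(client, query, settings=None):
    """Stream rows one at a time and sum the first column without buffering."""
    total = 0
    for row in client.execute_iter(query, settings=settings):
        total += row[0]
    return total


# Usage with the driver installed (not executed here):
#     from clickhouse_driver import Client
#     client = Client("localhost")
#     sum_first_column(
#         client,
#         "SELECT number FROM system.numbers LIMIT 1000000",
#         settings={"max_block_size": 100000},
#     )
```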

execute_with_progress(query, params=None, with_column_types=False, external_tables=None, query_id=None, settings=None, types_check=False, columnar=False)

Executes a SELECT query with progress information. See Selecting data with progress statistics.

Parameters:
  • query – query that will be sent to the server.
  • params – substitution parameters for SELECT queries and data for INSERT queries. Data for INSERT can be a list, tuple or GeneratorType. Defaults to None (no parameters or data).
  • with_column_types – if True, column names and types are returned alongside the result. Defaults to False.
  • external_tables – external tables to send. Defaults to None (no external tables).
  • query_id – the query identifier. If no query id is specified, the ClickHouse server generates one.
  • settings – dictionary of query settings. Defaults to None (no additional settings).
  • types_check – enables type checking of data for INSERT queries. Causes additional overhead. Defaults to False.
  • columnar – if True, the result is returned in column-oriented form. Defaults to False (row-like form).

Returns:

ProgressQueryResult proxy.
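The returned proxy can be iterated for progress updates and then asked for the stored result. The sketch below assumes each progress item is a (rows read, total rows) pair; the helper name `run_with_progress` and the `report` callback are hypothetical.

```python
def run_with_progress(client, query, report):
    """Iterate over progress updates, then return the stored result."""
    progress = client.execute_with_progress(query)
    for num_rows, total_rows in progress:
        # total_rows can be 0 when no estimate is available yet.
        fraction = num_rows / total_rows if total_rows else 0.0
        report(fraction)
    return progress.get_result()


# Usage with the driver installed (not executed here):
#     run_with_progress(client, "SELECT ...", lambda f: print(f"{f:.0%}"))
```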

classmethod from_url(url)

Return a client configured from the given URL.

For example:

clickhouse://[user:password]@localhost:9000/default
clickhouses://[user:password]@localhost:9440/default

Two URL schemes are supported:

  • clickhouse:// creates a normal TCP socket connection.
  • clickhouses:// creates an SSL-wrapped TCP socket connection.

Any additional querystring arguments will be passed along to the Connection class’s initializer.
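A URL of this shape can be composed with the standard library. This is a sketch only: `build_url` is a hypothetical helper, and percent-encoding the credentials follows general URL convention, so whether the driver decodes them the same way is an assumption worth verifying.

```python
from urllib.parse import quote, urlencode


def build_url(host, database, user=None, password=None,
              port=None, secure=False, **options):
    """Compose a connection URL; extra options become querystring arguments."""
    scheme = "clickhouses" if secure else "clickhouse"
    credentials = ""
    if user is not None:
        # Assumption: credentials are percent-encoded as in ordinary URLs.
        credentials = quote(user, safe="")
        if password is not None:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    netloc = host if port is None else f"{host}:{port}"
    query = "?" + urlencode(options) if options else ""
    return f"{scheme}://{credentials}{netloc}/{database}{query}"


url = build_url("localhost", "default", user="alice", password="p@ss",
                port=9440, secure=True, compression="lz4")

# With the driver installed:
#     from clickhouse_driver import Client
#     client = Client.from_url(url)
```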

insert_dataframe(query, dataframe, external_tables=None, query_id=None, settings=None)

New in version 0.2.0.

Inserts a pandas DataFrame using the specified query.

Parameters:
  • query – query that will be sent to the server.
  • dataframe – pandas DataFrame.
  • external_tables – external tables to send. Defaults to None (no external tables).
  • query_id – the query identifier. If no query id is specified, the ClickHouse server generates one.
  • settings – dictionary of query settings. Defaults to None (no additional settings).

Returns:

number of inserted rows.

query_dataframe(query, params=None, external_tables=None, query_id=None, settings=None)

New in version 0.2.0.

Runs the specified SELECT query and returns the result as a pandas DataFrame.

Parameters:
  • query – query that will be sent to the server.
  • params – substitution parameters. Defaults to None (no parameters or data).
  • external_tables – external tables to send. Defaults to None (no external tables).
  • query_id – the query identifier. If no query id is specified, the ClickHouse server generates one.
  • settings – dictionary of query settings. Defaults to None (no additional settings).

Returns:

pandas DataFrame.
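The two DataFrame methods can be combined into a round trip. The sketch below uses a hypothetical helper, `copy_via_dataframe`, and hypothetical table names in the usage comment. It assumes the client was created with the use_numpy setting enabled and that pandas is installed.

```python
def copy_via_dataframe(client, select_query, insert_query):
    """Read a SELECT into a DataFrame and insert it with another query."""
    frame = client.query_dataframe(select_query)
    inserted = client.insert_dataframe(insert_query, frame)
    return frame, inserted


# Usage with the driver and pandas installed (not executed here):
#     from clickhouse_driver import Client
#     client = Client("localhost", settings={"use_numpy": True})
#     frame, inserted = copy_via_dataframe(
#         client,
#         "SELECT x, y FROM source_table",
#         "INSERT INTO target_table (x, y) VALUES",
#     )
```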

Connection

class clickhouse_driver.connection.Connection(host, port=None, database='', user='default', password='', client_name='python-driver', connect_timeout=10, send_receive_timeout=300, sync_request_timeout=5, compress_block_size=1048576, compression=False, secure=False, verify=True, ssl_version=None, ca_certs=None, ciphers=None, keyfile=None, certfile=None, alt_hosts=None, settings_is_important=False)

Represents a connection between the client and the ClickHouse server.

Parameters:
  • host – host running the ClickHouse server.
  • port – port the ClickHouse server is bound to. Defaults to 9000 if the connection is not secured and to 9440 if it is secured.
  • database – database to connect to. Defaults to 'default'.
  • user – database user. Defaults to 'default'.
  • password – user’s password. Defaults to '' (no password).
  • client_name – this name will appear in server logs. Defaults to 'python-driver'.
  • connect_timeout – timeout for establishing connection. Defaults to 10 seconds.
  • send_receive_timeout – timeout for sending and receiving data. Defaults to 300 seconds.
  • sync_request_timeout – timeout for server ping. Defaults to 5 seconds.
  • compress_block_size – size of compressed block to send. Defaults to 1048576.
  • compression – specifies whether or not to use compression. Defaults to False. Possible choices:
    • True is equivalent to 'lz4'.
    • 'lz4'.
    • 'lz4hc', the high-compression variant of 'lz4'.
    • 'zstd'.
  • secure – establish secure connection. Defaults to False.
  • verify – specifies whether a certificate is required and whether it will be validated after connection. Defaults to True.
  • ssl_version – see ssl.wrap_socket() docs.
  • ca_certs – see ssl.wrap_socket() docs.
  • ciphers – see ssl.wrap_socket() docs.
  • keyfile – see ssl.wrap_socket() docs.
  • certfile – see ssl.wrap_socket() docs.
  • alt_hosts – list of alternative hosts for connection. Example: alt_hosts=host1:port1,host2:port2.
  • settings_is_important – False means unknown settings will be ignored; True means that the query will fail with an UNKNOWN_SETTING error. Defaults to False.
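Because Client forwards its remaining arguments to Connection, these parameters are usually supplied as keyword arguments. The sketch below builds such a dictionary for a primary host with fallbacks; `connection_options`, the host names and the chosen timeout are hypothetical.

```python
def connection_options(hosts, secure=False, compression=False):
    """Build keyword arguments for a primary host plus optional fallbacks."""
    primary, *fallbacks = hosts
    host, _, port = primary.partition(":")
    options = {
        "host": host,
        "secure": secure,
        "compression": compression,
        "connect_timeout": 5,  # seconds; illustrative value
    }
    if port:
        options["port"] = int(port)
    if fallbacks:
        # alt_hosts uses the comma-separated host:port form shown above.
        options["alt_hosts"] = ",".join(fallbacks)
    return options


options = connection_options(
    ["ch1.example:9440", "ch2.example:9440"],
    secure=True,
    compression="lz4",
)

# With the driver installed:
#     from clickhouse_driver import Client
#     client = Client(**options)
```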

disconnect()

Closes the connection between server and client and frees resources, e.g. closes the socket.

QueryResult

class clickhouse_driver.result.QueryResult(packet_generator, with_column_types=False, columnar=False)

Stores query result from multiple blocks.

get_result()
Returns: stored query result.

ProgressQueryResult

class clickhouse_driver.result.ProgressQueryResult(*args, **kwargs)

Stores query result and progress information from multiple blocks. Provides iteration over query progress.

get_result()
Returns: stored query result.

IterQueryResult

class clickhouse_driver.result.IterQueryResult(packet_generator, with_column_types=False)

Provides iteration over returned data by chunks (streaming by chunks).