Supported types¶
Each ClickHouse type is deserialized to a corresponding Python type when SELECT queries are prepared. When serializing INSERT queries, clickhouse-driver accepts a broader range of Python types. The following ClickHouse types are supported by clickhouse-driver:
[U]Int8/16/32/64/128/256¶
SELECT type: int.
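Python's int is unbounded, while every ClickHouse integer column has a fixed width. The sketch below computes the value range of each width so values can be checked before an INSERT. The helper names int_range and fits are hypothetical and are not part of clickhouse-driver.

```python
def int_range(bits, signed=True):
    """Return (min, max) for a fixed-width integer of the given bit width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def fits(value, bits, signed=True):
    """Check whether a Python int fits into the given integer type."""
    low, high = int_range(bits, signed)
    return low <= value <= high


print(int_range(8))              # range of Int8
print(int_range(8, False))       # range of UInt8
print(fits(2 ** 64, 64, False))  # one past the UInt64 maximum
```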
Float32/64¶
INSERT types: float, int, long.
SELECT type: float.
Date/Date32¶
Date32 support is new in version 0.2.2.
SELECT type: date.
Since version 0.2.8 you can reduce the memory used for date handling.
Before 0.2.8, a special date lookup table (LUT) was always built for the whole supported date range during package initialization. Building it takes some time and about 40 MB of memory. The table is used for fast (de)serialization of Date columns. You can now control how the LUT is initialized. There are 3 options:
Static initialization with the whole date range, as it was before 0.2.8. This is the default.
Lazy initialization. The LUT is filled while Date columns are processed. To enable this option, set the CLICKHOUSE_DRIVER_LASY_DATE_LUT environment variable to a non-empty value. Example: CLICKHOUSE_DRIVER_LASY_DATE_LUT=1.
Lazy initialization with a static partial date range. To enable this option, set the CLICKHOUSE_DRIVER_LASY_DATE_LUT environment variable to the desired date range. The LUT is pre-filled for the specified range, and dates outside that range are added lazily. Example: CLICKHOUSE_DRIVER_LASY_DATE_LUT=2000-01-01:2030-01-01.
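The idea behind the partial lazy option can be sketched in plain Python. This is an illustration only, not the driver's implementation: the LazyDateLUT class is hypothetical, and the inclusive handling of the range end is an assumption. Since the LUT is built during package initialization, the environment variable presumably has to be set before clickhouse_driver is imported.

```python
import os
from datetime import date, timedelta

# Assumption: set the variable before importing clickhouse_driver,
# because the LUT is built during package initialization.
os.environ['CLICKHOUSE_DRIVER_LASY_DATE_LUT'] = '2000-01-01:2030-01-01'

EPOCH = date(1970, 1, 1)


class LazyDateLUT:
    """Hypothetical lookup table: days since the UNIX epoch -> date."""

    def __init__(self, start=None, end=None):
        self._table = {}
        # Pre-fill the requested range (inclusive in this sketch).
        if start is not None and end is not None:
            first = (start - EPOCH).days
            last = (end - EPOCH).days
            for days in range(first, last + 1):
                self._table[days] = EPOCH + timedelta(days=days)

    def __getitem__(self, days):
        # Dates outside the pre-filled range are computed once and cached.
        try:
            return self._table[days]
        except KeyError:
            value = self._table[days] = EPOCH + timedelta(days=days)
            return value

    def __len__(self):
        return len(self._table)


lut = LazyDateLUT(date(2000, 1, 1), date(2000, 1, 31))
print(len(lut))  # entries pre-filled for January 2000
print(lut[0])    # outside the range, added lazily
print(len(lut))  # one more entry than before
```

The trade-off is the usual one for caches: a smaller table at start-up in exchange for a slower first lookup of each date outside the pre-filled range.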
DateTime(‘timezone’)/DateTime64(‘timezone’)¶
Timezone support is new in version 0.0.11. DateTime64 support is new in version 0.1.3.
INSERT types: datetime, int, long.
Integers are interpreted as seconds without timezone (UNIX timestamps). Use integers when inserting into a DateTime column is a bottleneck.
SELECT type: datetime.
The use_client_time_zone setting is taken into account.
You can cast a DateTime column to integers if you face performance issues when selecting a large number of rows.
Because Python's datetime supports at most microsecond precision, the finest DateTime64 resolution available is one microsecond.
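The conversion between datetime objects and the integer form mentioned above needs only the standard library. A minimal sketch follows; the helper name to_unix_seconds is hypothetical, and the example uses UTC explicitly so that the result does not depend on the local timezone.

```python
from datetime import datetime, timezone


def to_unix_seconds(dt):
    """Convert a timezone-aware datetime to whole seconds since the epoch."""
    if dt.tzinfo is None:
        raise ValueError('expected a timezone-aware datetime')
    return int(dt.timestamp())


moment = datetime(2020, 1, 1, 12, 30, 0, tzinfo=timezone.utc)
seconds = to_unix_seconds(moment)
print(seconds)

# Converting back yields the same moment.
restored = datetime.fromtimestamp(seconds, tz=timezone.utc)
print(restored == moment)

# The smallest step a Python datetime can represent is one microsecond.
print(datetime.resolution)
```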
String/FixedString(N)¶
INSERT types: str, bytes. See note below.
SELECT type: str, bytes. See note below.
String column is encoded/decoded with encoding specified by strings_encoding setting. Default encoding is UTF-8.
You can specify custom encoding:
>>> settings = {'strings_encoding': 'cp1251'}
>>> rows = client.execute(
...     'SELECT * FROM table_with_strings',
...     settings=settings
... )
Encoding is applied to all string fields in query.
String columns can be returned without any decoding. In this case return values are bytes:
>>> settings = {'strings_as_bytes': True}
>>> rows = client.execute(
...     'SELECT * FROM table_with_strings',
...     settings=settings
... )
ClickHouse's storage format pads FixedString values with trailing zero bytes. For convenience, the driver strips these trailing zeroes from values returned by SELECT.
During SELECT, if a string cannot be decoded with the specified encoding, it is returned as bytes.
During INSERT, if the strings_as_bytes setting is not specified and a string cannot be encoded with the specified encoding, a UnicodeEncodeError is raised.
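The behaviour described in these notes can be mimicked with the standard library. The sketch below is an illustration, not the driver's code; decode_string and strip_fixed_string are hypothetical helper names.

```python
def decode_string(raw, encoding='utf-8'):
    """Decode bytes, falling back to the raw bytes when decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return raw


def strip_fixed_string(raw):
    """Remove the zero-byte padding of a FixedString value."""
    return raw.rstrip(b'\x00')


cp1251_bytes = 'привет'.encode('cp1251')
print(decode_string(cp1251_bytes, 'cp1251'))  # decoded to str
print(decode_string(b'\xff\xfe'))             # not valid UTF-8, stays bytes
print(strip_fixed_string(b'ab\x00\x00'))      # padding removed

# Encoding fails loudly when the target encoding lacks the characters.
try:
    'привет'.encode('ascii')
except UnicodeEncodeError as exc:
    print(type(exc).__name__)
```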
Enum8/16¶
INSERT types: Enum, int, long, str.
SELECT type: str.
>>> from enum import IntEnum
>>>
>>> class MyEnum(IntEnum):
...     foo = 1
...     bar = 2
...
>>> client.execute('DROP TABLE IF EXISTS test')
[]
>>> client.execute('''
...     CREATE TABLE test
...     (
...         x Enum8('foo' = 1, 'bar' = 2)
...     ) ENGINE = Memory
... ''')
[]
>>> client.execute(
...     'INSERT INTO test (x) VALUES',
...     [{'x': MyEnum.foo}, {'x': 'bar'}, {'x': 1}]
... )
3
>>> client.execute('SELECT * FROM test')
[('foo',), ('bar',), ('foo',)]
Currently clickhouse-driver can’t handle an empty enum value due to Python’s Enum mechanics: an Enum member name must not be empty. See issue and workaround.
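The three accepted INSERT forms (Enum member, str, int) all resolve to the same label on SELECT. The sketch below reproduces that mapping with the standard enum module; to_label is a hypothetical helper and does not reflect how the driver is implemented.

```python
from enum import IntEnum


class MyEnum(IntEnum):
    foo = 1
    bar = 2


def to_label(value, enum_cls):
    """Map an Enum member, an int value, or a str name to the member name."""
    if isinstance(value, enum_cls):
        return value.name
    if isinstance(value, int):
        return enum_cls(value).name   # lookup by value
    return enum_cls[value].name       # lookup by name


print([to_label(v, MyEnum) for v in (MyEnum.foo, 'bar', 1)])
```

The Enum member check comes first because IntEnum members are also instances of int.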
Array(T)¶
SELECT type: list.
Versions before 0.1.4: SELECT type: tuple.
>>> client.execute('DROP TABLE IF EXISTS test')
[]
>>> client.execute(
...     'CREATE TABLE test (x Array(Int32)) '
...     'ENGINE = Memory'
... )
[]
>>> client.execute(
...     'INSERT INTO test (x) VALUES',
...     [{'x': [10, 20, 30]}, {'x': [11, 21, 31]}]
... )
2
>>> client.execute('SELECT * FROM test')
[([10, 20, 30],), ([11, 21, 31],)]
Nullable(T)¶
INSERT types: NoneType, T.
SELECT type: NoneType, T.
Bool¶
INSERT types: bool.
SELECT type: bool.
UUID¶
SELECT type: UUID.
Decimal¶
New in version 0.0.16.
INSERT types: Decimal, float, int, long.
SELECT type: Decimal.
Supported subtypes:
Decimal(P, S).
Decimal32(S).
Decimal64(S).
Decimal128(S).
Decimal256(S). New in version 0.2.1.
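Because float is accepted on INSERT, it is worth recalling that a float carries a binary approximation, while a Decimal built from a string is exact. The sketch below shows the difference and rounds a value to a column's scale S before insertion; to_scale is a hypothetical helper, and how the driver itself treats excess digits is not covered here.

```python
from decimal import Decimal


def to_scale(value, scale):
    """Round a Decimal to `scale` digits after the decimal point."""
    # Decimal(1).scaleb(-2) is Decimal('0.01'), the quantization step.
    return value.quantize(Decimal(1).scaleb(-scale))


exact = Decimal('0.1')
from_float = Decimal(0.1)   # carries the binary approximation of 0.1
print(exact == from_float)

print(to_scale(Decimal('3.14159'), 2))  # value prepared for Decimal(P, 2)
```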
IPv4/IPv6¶
New in version 0.0.19.
INSERT types: IPv4Address/IPv6Address, int, long, str.
SELECT type: IPv4Address/IPv6Address.
>>> from ipaddress import IPv4Address, IPv6Address
>>>
>>> client.execute('DROP TABLE IF EXISTS test')
[]
>>> client.execute(
...     'CREATE TABLE test (x IPv4) '
...     'ENGINE = Memory'
... )
[]
>>> client.execute(
...     'INSERT INTO test (x) VALUES', [
...     {'x': '192.168.253.42'},
...     {'x': 167772161},
...     {'x': IPv4Address('192.168.253.42')}
... ])
3
>>> client.execute('SELECT * FROM test')
[(IPv4Address('192.168.253.42'),), (IPv4Address('10.0.0.1'),), (IPv4Address('192.168.253.42'),)]
>>>
>>> client.execute('DROP TABLE IF EXISTS test')
[]
>>> client.execute(
...     'CREATE TABLE test (x IPv6) '
...     'ENGINE = Memory'
... )
[]
>>> client.execute(
...     'INSERT INTO test (x) VALUES', [
...     {'x': '79f4:e698:45de:a59b:2765:28e3:8d3a:35ae'},
...     {'x': IPv6Address('12ff:0000:0000:0000:0000:0000:0000:0001')},
...     {'x': b"y\xf4\xe6\x98E\xde\xa5\x9b'e(\xe3\x8d:5\xae"}
... ])
3
>>> client.execute('SELECT * FROM test')
[(IPv6Address('79f4:e698:45de:a59b:2765:28e3:8d3a:35ae'),), (IPv6Address('12ff::1'),), (IPv6Address('79f4:e698:45de:a59b:2765:28e3:8d3a:35ae'),)]
>>>
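The integer and bytes values used in the example above are alternative spellings of the same addresses. This can be checked with the standard ipaddress module alone, without a server:

```python
from ipaddress import IPv4Address, IPv6Address

# The integer 167772161 is the 32-bit form of 10.0.0.1.
as_int = int(IPv4Address('10.0.0.1'))
print(as_int)
print(IPv4Address(167772161))

# The 16-byte value is the packed form of the first IPv6 address.
raw = b"y\xf4\xe6\x98E\xde\xa5\x9b'e(\xe3\x8d:5\xae"
from_bytes = IPv6Address(raw)
print(from_bytes)

# Runs of zero groups are compressed when the address is printed.
long_form = IPv6Address('12ff:0000:0000:0000:0000:0000:0000:0001')
print(long_form.compressed)
```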
LowCardinality(T)¶
New in version 0.0.20.
INSERT types: T.
SELECT type: T.
SimpleAggregateFunction(F, T)¶
New in version 0.0.21.
INSERT types: T.
SELECT type: T.
AggregateFunctions for AggregatingMergeTree Engine are not supported.
Tuple(T1, T2, …)¶
New in version 0.1.4.
SELECT type: tuple.
Note
Currently, for ClickHouse server 23.3.1, a JSON column Object('json')
and a named tuple column Tuple(b Int8) have the same binary
representation. There is no way to distinguish one column from the other
without additional inspection, such as a DESCRIBE TABLE query, and that
does not work for complicated queries with joins.
To have a ClickHouse named tuple column interpreted as a Python tuple
while allow_experimental_object_type=1 is set, set the
namedtuple_as_json setting to False.
client.execute(..., settings={'namedtuple_as_json': False})
CREATE TABLE test (
a Tuple(b Int8),
c Object('json')
) ENGINE = Memory
INSERT INTO test VALUES ((1), '{"x": 2}');
>>> client.execute('SELECT * FROM test')
[((1,), (2,))]
>>> client.execute(
... 'SELECT * FROM test',
... settings={'allow_experimental_object_type': 1}
... )
[({'b': 1}, {'x': 2})]
>>> client.execute(
... 'SELECT * FROM test',
... settings={
... 'allow_experimental_object_type': 1,
... 'namedtuple_as_json': False
... }
... )
[((1,), (2,))]
Nested(flatten_nested=1, default)¶
The Nested type is represented by a sequence of arrays when flatten_nested=1. In the example below, the actual
columns are col.name and col.version.
:) CREATE TABLE test_nested (col Nested(name String, version UInt32)) Engine = Memory;
CREATE TABLE test_nested
(
    `col` Nested(name String, version UInt32)
)
ENGINE = Memory
Ok.
0 rows in set. Elapsed: 0.005 sec.
:) DESCRIBE TABLE test_nested FORMAT TSV;
DESCRIBE TABLE test_nested
FORMAT TSV
col.name       Array(String)
col.version    Array(UInt32)
2 rows in set. Elapsed: 0.004 sec.
Inserting data into nested column in clickhouse-client:
:) INSERT INTO test_nested VALUES (['a', 'b', 'c'], [100, 200, 300]);
INSERT INTO test_nested VALUES
Ok.
1 rows in set. Elapsed: 0.003 sec.
Inserting data into nested column with clickhouse-driver:
client.execute('INSERT INTO test_nested VALUES', [
    (['a', 'b', 'c'], [100, 200, 300]),
])
Nested(flatten_nested=0)¶
The Nested type is represented by an array of named tuples when flatten_nested=0.
:) SET flatten_nested = 0;
SET flatten_nested = 0
Ok.
0 rows in set. Elapsed: 0.006 sec.
:) CREATE TABLE test_nested (col Nested(name String, version UInt32)) Engine = Memory;
CREATE TABLE test_nested
(
    `col` Nested(name String, version UInt32)
)
ENGINE = Memory
Ok.
0 rows in set. Elapsed: 0.005 sec.
:) DESCRIBE TABLE test_nested FORMAT TSV;
DESCRIBE TABLE test_nested
FORMAT TSV
col    Nested(name String, version UInt32)
1 rows in set. Elapsed: 0.004 sec.
Inserting data into nested column in clickhouse-client:
:) INSERT INTO test_nested VALUES ([('a', 100), ('b', 200), ('c', 300)]);
INSERT INTO test_nested VALUES
Ok.
1 rows in set. Elapsed: 0.003 sec.
Inserting data into nested column with clickhouse-driver:
client.execute(
    'INSERT INTO test_nested VALUES',
    [
        ([('a', 100), ('b', 200), ('c', 300)], )
    ]
)
# or
client.execute(
    'INSERT INTO test_nested VALUES',
    [
        {'col': [
            {'name': 'a', 'version': 100},
            {'name': 'b', 'version': 200},
            {'name': 'c', 'version': 300}
        ]}
    ]
)
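The two Nested layouts carry the same data in different shapes. As a small sketch in plain Python, the parallel arrays used with flatten_nested=1 can be converted into the tuple and dict row forms used with flatten_nested=0; the variable names are illustrative only.

```python
names = ['a', 'b', 'c']
versions = [100, 200, 300]

# flatten_nested=1: one array per sub-column.
flat_row = (names, versions)

# flatten_nested=0: a single array of (name, version) tuples.
tuple_row = (list(zip(names, versions)),)

# The same row in dict form, keyed by column and sub-column names.
dict_row = {
    'col': [{'name': n, 'version': v} for n, v in zip(names, versions)]
}

print(flat_row)
print(tuple_row)
print(dict_row)
```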
Map(key, value)¶
New in version 0.2.1.
INSERT types: dict.
SELECT type: dict.
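A minimal sketch of inserting dict values into a Map column, in the same style as the other INSERT examples on this page. The table test_map and its column m are hypothetical (for instance, created with m Map(String, UInt64)), insert_map_rows is a hypothetical helper, and client is assumed to be a connected clickhouse-driver client.

```python
def insert_map_rows(client, rows):
    """Insert rows whose 'm' value is a dict into a Map column."""
    for row in rows:
        if not isinstance(row['m'], dict):
            raise TypeError('Map values must be dicts')
    # Hypothetical table and column names.
    return client.execute('INSERT INTO test_map (m) VALUES', rows)


rows = [
    {'m': {'clicks': 10, 'views': 250}},
    {'m': {}},  # an empty Map
]
```

Calling insert_map_rows(client, rows) then sends both rows in a single INSERT.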
Geo¶
New in version 0.2.4.
Point, Ring, Polygon, MultiPolygon.
These types are just aliases:
Point: Tuple(Float64, Float64)
Ring: Array(Point)
Polygon: Array(Ring)
MultiPolygon: Array(Polygon)
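Since the Geo types are aliases, their Python values are nested tuples and lists, following the mappings for Tuple and Array above. The sketch below builds one value of each type and checks its shape; the is_* helpers are hypothetical and only mirror the alias definitions.

```python
def is_point(value):
    """Point: a tuple of two numbers."""
    return (isinstance(value, tuple) and len(value) == 2
            and all(isinstance(c, (int, float)) for c in value))


def is_ring(value):
    """Ring: a list of points."""
    return isinstance(value, list) and all(is_point(p) for p in value)


def is_polygon(value):
    """Polygon: a list of rings."""
    return isinstance(value, list) and all(is_ring(r) for r in value)


def is_multipolygon(value):
    """MultiPolygon: a list of polygons."""
    return isinstance(value, list) and all(is_polygon(p) for p in value)


point = (1.0, 2.0)
first_ring = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
second_ring = [(4.0, 4.0), (6.0, 4.0), (6.0, 6.0), (4.0, 6.0)]
polygon = [first_ring, second_ring]
multipolygon = [polygon, [[(20.0, 20.0), (30.0, 20.0), (30.0, 30.0)]]]

print(is_point(point), is_ring(first_ring))
print(is_polygon(polygon), is_multipolygon(multipolygon))
```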
Object(‘json’)¶
New in version 0.2.6.
INSERT types: dict.
orjson and ujson implementations are supported for dumping data into
json during INSERT.
Set allow_experimental_object_type=1 to enable JSON support.