Reference
Limitations, Quirks, and Tradeoffs
CipherStash provides transparent encryption and decryption of sensitive data in SQL databases. It works by intercepting SQL queries and encrypting or decrypting the sensitive data on the fly. This reference document outlines the limitations, quirks, and tradeoffs of using CipherStash.
Limitations
Unsupported Features
When using CipherStash Proxy for PostgreSQL, some SQL features are not supported on columns encrypted by CipherStash.
However, those SQL features will continue to work on plaintext columns that are not encrypted by CipherStash.
When an unsupported feature is used, CipherStash Proxy will return an error.
Setting encrypted columns to arbitrary expressions in an UPDATE
1-- Not supported:
2UPDATE ...
3SET some_encrypted_col = some_encrypted_col + 1;
JOIN
on encrypted columns
It is generally not possible to JOIN
on encrypted columns that may contain the same plaintext:
1-- Not supported:
2SELECT t.id,
3 t.amount,
4 p.name
5FROM transactions t
6JOIN payee p ON t.payee_email = p.email
7ORDER BY t.name;
SQL statements using JOIN
on plaintext columns will run successfully.
JOIN
on encrypted columns with USING
or NATURAL
Any JOIN
using USING
or NATURAL
will error, regardless of whether the JOIN
is on an encrypted or plaintext column.
1-- Not supported:
2SELECT *
3FROM toy
4NATURAL JOIN cat;
5
6-- Not supported:
7SELECT post_id,
8 title,
9 review
10FROM post
11INNER JOIN post_comment USING(post_id);
INSERT
from other tables
1-- Not supported:
2INSERT INTO some_configured_table
3SELECT *
4FROM other_table;
Partitioning by encrypted columns
Partitioning splits what is logically one large table into smaller physical pieces.
Partitioning on encrypted columns is not supported:
1-- Not supported:
2CREATE TABLE measurement (
3 city_id int not null,
4 logdate date not null,
5 peaktemp int,
6 unitsales int
7) PARTITION BY RANGE (logdate);
8
9CREATE TABLE measurement_y2006m02 PARTITION OF measurement
10 FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
11
12CREATE TABLE measurement_y2006m03 PARTITION OF measurement
13 FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
Partitioning on plaintext columns is supported.
Subqueries
1-- Not supported:
2SELECT id,
3 first_name,
4 last_name
5FROM employees
6WHERE department_id IN
7 (SELECT department_id
8 FROM departments
9 WHERE location_ID = 1700);
Temporary tables
1-- Not supported:
2CREATE
3TEMPORARY TABLE table_t (column1 INT);
Window functions
1-- Not supported:
2SELECT product_name,
3 price,
4 group_name,
5 AVG (price) OVER (PARTITION BY group_name)
6FROM products
7INNER JOIN product_groups USING (group_id);
Prepared statements params sequence
The params placeholders in prepared statements need to appear in sequential order. Each placeholder can only appear once.
1-- Supported:
2PREPARE find_user as SELECT id, first_name FROM users WHERE first_name like $1 or email like $2;
3
4-- Not supported:
5PREPARE find_user as SELECT id, first_name FROM users WHERE first_name like $2 or email like $1;
6
7-- Not supported:
8PREPARE find_user as SELECT id, first_name FROM users WHERE first_name like $1 or email like $1;
SQL features that differ from the upstream PostgreSQL client
These SQL features are either partially supported, or supported with workarounds.
Bulk UPDATE
UPDATE
statements with multiple modified rows are partially supported.
Bulk updates with literal values are supported:
1-- Supported:
2UPDATE accounts
3SET balance = 100
4WHERE id IN (1,
5 2,
6 3);
UPDATE
statements using values lists are not supported:
1-- Not supported:
2UPDATE table
3SET update_column = temp.value
4FROM (
5 VALUES ('foo', 'bar'), ('baz', 'qux'), ('et', 'cetera')
6) temp (id, value)
7WHERE key_column = temp.id;
UPDATE
statements (bulk or single record) that set columns to arbitrary expressions are not supported:
1-- Not supported:
2UPDATE account SET balance = balance + 1 WHERE id = 123.
UPDATE
statements directly from other tables are not supported:
1-- Not supported:
2UPDATE accounts
3SET contact_first_name = first_name,
4 contact_last_name = last_name
5FROM employees
6WHERE employees.id = accounts.sales_person;
Schema-level default values for encrypted columns
1-- Not supported:
2CREATE TABLE products (
3 product_no integer,
4 name text,
5 price numeric DEFAULT 9.99
6);
This can be worked around with application-level defaults (for example, in your ORM).
This will be supported in future releases.
Aggregate functions
Only COUNT()
is supported:
1-- Supported:
2SELECT COUNT(id)
3FROM users
4WHERE date_of_birth > '1984-01-01';
Other aggregate functions are not supported:
1-- Not supported:
2SELECT department_id,
3 SUM(salary)
4FROM employees
5GROUP BY department_id;
INSERT
with ON CONFLICT
Columns managed by CipherStash can only be used in the DO UPDATE
portion of ON CONFLICT
clauses. The values assigned to a CipherStash-managed column must be one of:
- A reference to the corresponding column in the special
EXCLUDED
table (example:EXCLUDED.cs_column1
). - A parameter reference (example:
$1
). Note that each parameter can only be referenced once in a single statement and must be used in ascending, numeric order. - A literal value (example:
'some value'
).
Columns managed by CipherStash cannot be used in:
- The conflict target.
- The
WHERE
condition of aDO UPDATE
action. - The value assigned to a column not managed by CipherStash.
1-- Supported:
2INSERT INTO table_a (not_cs_column1, cs_column1)
3VALUES ('value1', 'value2'),
4 ('value3', 'value4')
5ON CONFLICT (not_cs_column1)
6 DO UPDATE SET cs_column1 = EXCLUDED.cs_column1;
7
8-- Supported:
9INSERT INTO table_a (not_cs_column1, cs_column1)
10VALUES ('value1', 'value2'),
11 ('value3', 'value4')
12ON CONFLICT ON CONSTRAINT constraint_name
13 DO UPDATE SET cs_column1 = EXCLUDED.cs_column1;
14
15-- Supported:
16INSERT INTO table_a (not_cs_column1, cs_column1)
17VALUES ($1, $2),
18 ($3, $4)
19ON CONFLICT (not_cs_column1)
20 DO UPDATE SET cs_column1 = $5;
21
22-- Supported:
23INSERT INTO table_a (not_cs_column1, cs_column1)
24VALUES ('value1', 'value2'),
25 ('value3', 'value4')
26ON CONFLICT (not_cs_column1)
27 DO UPDATE SET cs_column1 = 'some literal value';
28
29-- Not supported:
30INSERT INTO table_a (cs_column1, cs_column2)
31VALUES ('value1', 'value2'),
32 ('value3', 'value4'),
33 ('value5', 'value6')
34ON CONFLICT (cs_column1) -- Can't use a CipherStash-managed column as the conflict target.
35 DO NOTHING;
36
37-- Not supported:
38INSERT INTO table_a AS a (not_cs_column1, not_cs_column2, cs_column1)
39VALUES ('value1', 'value2', 'value3'),
40 ('value4', 'value5', 'value6')
41ON CONFLICT (not_cs_column1)
42 DO UPDATE SET not_cs_column2 = EXCLUDED.not_cs_column2
43 WHERE a.cs_column1 > 100; -- Can't use a CipherStash-managed column in the `WHERE` condition here.
44
45-- Not supported:
46INSERT INTO table_a (not_cs_column1, not_cs_column2, cs_column1)
47VALUES ('value1', 'value2', 'value3'),
48 ('value4', 'value5', 'value6')
49ON CONFLICT (not_cs_column1)
50 DO UPDATE SET not_cs_column2 = EXCLUDED.cs_column1; -- Can't assign a CipherStash-managed column to a non-CipherStash-managed column.
Columns not managed by CipherStash can be used anywhere in ON CONFLICT
clauses (except for in values assigned to columns that are managed by CipherStash):
1-- Supported:
2INSERT INTO table_a (not_cs_column1, not_cs_column2)
3VALUES ('value1', 'value2'),
4 ('value3', 'value4'),
5 ('value5', 'value6')
6ON CONFLICT (not_cs_column1) DO NOTHING;
7
8-- Supported:
9INSERT INTO table_a (not_cs_column1, not_cs_column2)
10VALUES ('value1', 'value2'),
11 ('value3', 'value4'),
12 ('value5', 'value6')
13ON CONFLICT (not_cs_column1)
14 DO UPDATE SET not_cs_column2 = EXCLUDED.not_cs_column2;
15
16-- Not supported:
17INSERT INTO table_a (not_cs_column1, not_cs_column2, cs_column1)
18VALUES ('value1', 'value2', 'value3'),
19 ('value4', 'value5', 'value6')
20ON CONFLICT (not_cs_column1)
21 DO UPDATE SET cs_column1 = EXCLUDED.not_cs_column2; -- Can't assign a non-CipherStash-managed column to a CipherStash-managed column.
Rails 7+ uses
ON CONFLICT ... DO UPDATE
for#upsert_all
.
Queries that are not allowed
Queries on encrypted columns are not allowed
Queries that directly access the encrypted columns are not allowed. Encrypted columns have the following patterns:
__[field]_encrypted
__[field]_ore
__[field]_match
__[field]_unique
1-- Not allowed:
2SELECT __name_encrypted FROM users WHERE id = 1;
3
4-- Not allowed:
5SELECT id, name FROM users WHERE __name_encrypted = "John Doe";
6
7-- Not allowed:
8UPDATE users SET __name_encrypted = "John Doe" WHERE id = 1;
9
10-- Not allowed:
11INSERT INTO users (name, __name_encrypted) VALUES ("John Doe", "John Doe");
Quirks
CipherStash has the following quirks:
Match queries against text fields can have false positives. CipherStash uses bloom filters to perform queries against some indexed columns. However, bloom filters can have false positives, which means that a query result may contain records that do not actually match the query. This problem is not unique to CipherStash and is a known tradeoff of using bloom filters. To minimize false positives, index tuning is required.
LIKE
andILIKE
queries require minor modification. ForLIKE
andILIKE
queries to continue to work inencrypted-duplicate
andencrypted
modes, letter case functions need to be removed:1-- Before: 2SELECT name FROM user WHERE lower(name) LIKE "%Alice%"; 3 4-- After: 5SELECT name FROM user WHERE name LIKE "%Alice%";
NULL
s are not yet represented in an encrypted form. BecauseNULL
s don't yet have an encrypted representation, the driver cannot detect if aNULL
in an encrypted source column (__x_encrypted
for example) reflects the actual plaintext source value. This can lead to data loss during migrations. CipherStash intends to ship support for encryptedNULL
s in the future.Exact queries against citext (case-insensitive text) fields are case-sensitive Let's suppose a PostgreSQL table
users
with a fieldemail
of typecitext
. Inplaintext-duplicate
mode, an email with the valueSam_Fisher@example.com
would match any equal case-insensitive lookups such as:1SELECT email from users where email = 'sam_FISHER@example.com'; 2SELECT email from users where email = 'sam_fisher@example.com'; 3SELECT email from users where email = 'SAM_FISHER@EXAMPLE.COM';
In order to support this query in
encrypted-duplicate
andencrypted
modes, CipherStash uses the unique index column. The default behaviour for this index column is case-sensitive, which will cause the above queries to not return the record as expected. In order to preserve the case-insensitive lookup behaviour, make sure that the unique index is defined with the downcase token filter.1tables: 2 - path: users 3 fields: 4 - name: email 5 in_place: false 6 cast_type: utf8-str 7 mode: encrypted-duplicate 8 indexes: 9 - version: 1 10 kind: unique 11 token_filters: 12 - kind: downcase
This will lowercase the values before encrypting and inserting them into the index column. It will also lowercase any query terms before queries are performed.
Tradeoffs
CipherStash makes the following tradeoffs:
- The same ciphertext may appear in multiple rows during bulk operations. When transforming various queries (such as
UPDATE
) that do bulk operations, multiple rows can be set with the same encrypted value. This can make it easier for attackers to perform certain types of inference attacks. Ideally, the same plaintext value in multiple rows and fields should have a unique ciphertext. CipherStash intends to ship a release that fixes this in the future. - Encrypted left ciphertext. CipherStash stores the encrypted left ciphertext (as defined in the Order Revealing Encryption paper), which can make it easier for attackers to use inference attacks to determine the plaintext. This is a tradeoff that is made to provide better performance and usability.
CipherStash is the easiest and safest way to protect structured sensitive data in your organisation. By understanding its limitations, quirks, and tradeoffs, users can make informed choices about whether CipherStash is the right solution for their applications.