API

Coming soon: Proof of SQL is currently in alpha, and the API specs are not publicly available. To request access, join the Space and Time beta and reach out to the team.

About tamperproof queries

Tamperproof queries require two parties: a verifier and a prover. For the Space and Time architecture, the Validator nodes play the role of the verifier, and the database clusters play the role of the prover.

The process consists of two types of interactions: proof ingestion, and query requests.

Proof ingestion / DML

When a service or a client sends data that is to be added to the database, that data is routed to the verifier so that the verifier can create a commitment to that data. This commitment is a small "digest" of the data but holds enough information for the rest of the protocol to ensure that the data is not tampered with. After creating this commitment, the verifier can route the data to the database for storage. The verifier stores this commitment for later usage.

An important design constraint is that the commitment/digest must be updatable. In other words, the verifier must be able to combine the old commitment with the incoming data to create a new commitment to the entire updated table. The key constraint here is that the verifier must be able to do this without access to the old existing data.

Query requests / DQL

When a service or a client sends a query request, that request is routed through the gateway to the prover, which is the database. At this point, the prover parses the query, computes the correct result, and produces a proof of SQL. It then sends both the query result and the proof to the verifier.

Once the verifier has this proof, it can use the commitment to check the proof against the result and verify that the prover has produced the correct result to the query request. This query result can then be routed back to the service or client along with a tamperproof success flag. However, if the proof does not pass, the verifier does not route the result, but sends a failure message instead.

Commitments

The biggest computational cost revolves around the commitments and their uses. So, the decision of which commitment scheme we use is important. Fortunately, the commitment scheme that we chose can be treated as a "black box", so this choice can be modified as the product evolves.

Today, Proof of SQL uses Pedersen commitments, although we will soon be transitioning to a novel alternative with improved properties. We chose to start with Pedersen commitments because of their simplicity, updatability, and speed.

A Pedersen commitment relies on publicly known generators, denoted

g_{0}, g_{1}, g_{2}, \dots

Then, to compute the commitment to the sequence of data

d_{0}, d_{1}, d_{2}, \dots

we compute:

C_{\text{data}} = d_{0} g_{0} + d_{1} g_{1} + d_{2} g_{2} + \dots
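
The formula above can be sketched in Python. This is a toy illustration only, not the Proof of SQL implementation: real Pedersen commitments use elements of a cryptographic group as generators, whereas this sketch substitutes plain integers modulo a prime, which keeps the same algebra but provides no security. The names `Q`, `generator`, and `commit` are hypothetical and chosen for this example.

```python
import hashlib

# ASSUMPTION: integers modulo a prime stand in for the real group.
# This preserves the linear structure of the formula but is NOT secure.
Q = 2**127 - 1  # a Mersenne prime, picked only for illustration


def generator(i: int) -> int:
    """Derive the i-th public generator g_i deterministically (hypothetical scheme)."""
    digest = hashlib.sha256(f"generator-{i}".encode()).digest()
    return int.from_bytes(digest, "big") % Q


def commit(data: list[int], offset: int = 0) -> int:
    """Compute d_0 * g_offset + d_1 * g_(offset+1) + ... modulo Q."""
    return sum(d * generator(offset + i) for i, d in enumerate(data)) % Q


c_data = commit([4000, 7500, 5000, 1500])
```

Because every generator is derived from public information, anyone can recompute the commitment to a given sequence of data and compare it with a stored one.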

Example

We will walk through an example where the client creates a table, appends to it, and then queries the table.

Table creation

The client sends the following data to the verifier to create a table with some data.

Example Employees table:

Weekly Pay	Yearly Bonus
4000	50000
7500	0
5000	400000
1500	0

The verifier accepts this data and computes the commitment to each column. In this case, these are:

\begin{aligned} C_{\text{pay}} & = 4000 g_{0} + 7500 g_{1} + 5000 g_{2} + 1500 g_{3} \\ C_{\text{bonus}} & = 50000 g_{0} + 0 g_{1} + 400000 g_{2} + 0 g_{3} \end{aligned}

The verifier then stores these commitments and sends the data to the prover, who creates a new table in the database.

Table append

Suppose the client then sends the following new rows of data to the verifier.

Rows to append to Employees table:

Weekly Pay	Yearly Bonus
3000	100000
4500	30000

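The verifier accepts this data and updates the commitment to each column by adding the contribution of the new rows to the stored commitment. It does not need the existing rows to do this. In this case, the new commitments are:

\begin{aligned} C_{\text{pay}}^{\text{new}} & = C_{\text{pay}}^{\text{old}} + 3000 g_{4} + 4500 g_{5} \\ C_{\text{bonus}}^{\text{new}} & = C_{\text{bonus}}^{\text{old}} + 100000 g_{4} + 30000 g_{5} \end{aligned}

From here on, the Pay and Bonus commitments refer to these updated values.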

The verifier then sends the new rows to the prover, who appends them to the table in the database.
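
The append step can be sketched as follows. This is a toy model, not the actual implementation: integers modulo a prime stand in for the real group, so the sketch is not secure, and the names `Q`, `generator`, `commit`, and `append_rows` are hypothetical. It assumes the verifier keeps the number of rows already committed, so that it knows the new rows start at generator index 4.

```python
import hashlib

# ASSUMPTION: integers modulo a prime replace the real group (insecure toy).
Q = 2**127 - 1


def generator(i: int) -> int:
    """Derive the i-th public generator g_i deterministically (hypothetical scheme)."""
    digest = hashlib.sha256(f"generator-{i}".encode()).digest()
    return int.from_bytes(digest, "big") % Q


def commit(data: list[int], offset: int = 0) -> int:
    """Compute d_0 * g_offset + d_1 * g_(offset+1) + ... modulo Q."""
    return sum(d * generator(offset + i) for i, d in enumerate(data)) % Q


def append_rows(old_commitment: int, new_rows: list[int], row_count: int) -> int:
    """Update a column commitment using only the old commitment and the new rows."""
    return (old_commitment + commit(new_rows, offset=row_count)) % Q


# Commitment stored at table creation (4 rows of the Pay column).
pay_old = commit([4000, 7500, 5000, 1500])

# The verifier updates it without access to the original 4 rows.
pay_new = append_rows(pay_old, [3000, 4500], row_count=4)

# Same value as committing to the whole updated column from scratch.
assert pay_new == commit([4000, 7500, 5000, 1500, 3000, 4500])
```

The final assertion is the updatability property: updating the stored commitment gives the same value as recommitting to the entire updated column.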

SQL query

Finally, the client decides that it wants to know the total compensation. So, it sends the following SQL query to the verifier.

Total compensation SQL query:

SELECT Pay*52+Bonus FROM Employees

The verifier routes the query to the prover, who then executes the query by taking the Pay column, multiplying it by 52, and adding the Bonus column. This is the result, which is sent back to the verifier. Because this query is so simple, the prover does not need to send any additional proof to the verifier.

Employees table along with total compensation:

Weekly Pay	Yearly Bonus	Total Compensation
4000	50000	258000
7500	0	390000
5000	400000	660000
1500	0	78000
3000	100000	256000
4500	30000	264000

The verifier receives this result column from the prover. To verify it, the verifier must mirror the prover's computation by taking the Pay commitment, multiplying it by 52, and adding the Bonus commitment. This happens to be exactly the commitment to what the result should be. (Most operations are not this simple; more complex operations also require a proof.) The expected commitment is:

C_{\text{correct result}} = 52 \cdot C_{\text{pay}} + C_{\text{bonus}}

The verifier then also computes the commitment to the result that got sent from the prover. If the prover sent the correct result, this should be:

C_{\text{sent result}} = 258000 g_{0} + 390000 g_{1} + 660000 g_{2} + 78000 g_{3} + 256000 g_{4} + 264000 g_{5}

If the prover sent the wrong result, this will be something else. Then, the verifier checks to see if:

C_{\text{correct result}} = C_{\text{sent result}}

By combining all of the previous equations, one can see that these two should be the same. If they are the same, the verifier sends the result to the client along with a flag that confirms that the result is correct. If the two values are not equal, the verifier tells the client that there was an error.
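
The whole check can be sketched end to end in Python. As before, this is a toy model under stated assumptions rather than the real protocol: integers modulo a prime stand in for the group (so it is not secure), and `Q`, `generator`, `commit`, and `verify_total_compensation` are hypothetical names. The sketch covers only this one query shape, where the verifier can mirror the computation on commitments without an extra proof.

```python
import hashlib

# ASSUMPTION: integers modulo a prime replace the real group (insecure toy).
Q = 2**127 - 1


def generator(i: int) -> int:
    """Derive the i-th public generator g_i deterministically (hypothetical scheme)."""
    digest = hashlib.sha256(f"generator-{i}".encode()).digest()
    return int.from_bytes(digest, "big") % Q


def commit(data: list[int]) -> int:
    """Compute d_0 * g_0 + d_1 * g_1 + ... modulo Q."""
    return sum(d * generator(i) for i, d in enumerate(data)) % Q


def verify_total_compensation(result: list[int], c_pay: int, c_bonus: int) -> bool:
    """Check a claimed result of SELECT Pay*52+Bonus using only the commitments."""
    c_correct = (52 * c_pay + c_bonus) % Q  # mirror the query on commitments
    c_sent = commit(result)                 # commit to what the prover sent
    return c_correct == c_sent


# Prover side: the full table after the append.
pay = [4000, 7500, 5000, 1500, 3000, 4500]
bonus = [50000, 0, 400000, 0, 100000, 30000]

# Verifier side: only these two values are stored.
c_pay, c_bonus = commit(pay), commit(bonus)

honest = [p * 52 + b for p, b in zip(pay, bonus)]
tampered = honest.copy()
tampered[1] += 1  # the prover alters one row of the result

assert verify_total_compensation(honest, c_pay, c_bonus)
assert not verify_total_compensation(tampered, c_pay, c_bonus)
```

The honest result passes because the commitment is linear in the data, so applying `Pay*52+Bonus` to the commitments gives the commitment to the correct result column. Changing any single row shifts the sent commitment by a nonzero multiple of that row's generator, so the comparison fails.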