Load NYSE Stock Prices into SXT
This demo will guide you through loading an external dataset (NYSE Stock Data) into Space and Time.
This demo will illustrate how load an external dataset into a new Space and Time, including creating the table itself, defining biscuits, and a few different ways to execute the data insert. This is intended as a working demo and starting point for other similar data loads, so the structure is purposely simple. However, it can also be used as a quick-start guide for small projects.
While the code is entirely available on Github, you can also build all collateral by following along with the steps below. If you're looking to educate yourself on Space and Time, it's recommended you build along with the instructions below. If you want to jump into using the code as a quick-start, or are already comfortable with Space and Time, check out the github repo here:
Github Repo for this Demo: https://github.com/SxT-Community/demo_load_nyse
Summary
This guide will walk you through:
- Installing prerequisites
- Setting up required files and folders
- Creating your credential file
- Creating the code file
- Next steps to try
Prerequisites
The prerequisites for this demo are fairly basic - you'll need:
Valid Space and Time User_ID
Create a UserID:
The easiest way to create a new user is with Space and Time Studio, as it has wrapped all steps into a simple UI. To get started:
Congrats! You now have a Space and Time account!
Alternatively, you can create your user by "Connecting a wallet" immediately, however you lose the the option to login with a Password. It's generally recommended you create your account with a UserName / Password first, then connect that account with a wallet afterwards.
SXTCLI Installed
The SXTCLI is a great developer tool for interacting with the Space and Time network using scripts, or even one-time setup commands via the terminal. For purposes here, it's the easiest way to add a new ED25519 keypair to your UserID. If you already have an ED25519 keypair, you can skip down to setting up a subscription.
Install Java JDK
Java is fairly standard install on most development machines. To test if you have Java installed, open a terminal / CMD and enter java --version
which should return something like:
└─[$] java --version
java 20.0.1 2023-04-18
Java(TM) SE Runtime Environment (build 20.0.1+9-29)
Java HotSpot(TM) 64-Bit Server VM (build 20.0.1+9-29, mixed mode, sharing)
If the Java version is lower than 20.0.1
or is not installed (returns an error):
- Navigate to the JavaJDK page
- Select your OS / Chip Architecture
- Download the file named:
jdk-<latest_version>_*
- Follow install instructions per the installer for your operating system
- Once complete, confirm Java is installed by running
java --version
as per above
Install and Configure the SXTCLI
To install the SXTCLI, there are two steps: download the .jar file, and create a short-cut to the .jar file.
Download the .JAR File
This link to the SxTCLI will take you to a top-level directory with all CLI versions. We highly recommended that you always install the latest version.
Inside the version folder, you'll see two files: .jar and .pom. For most uses, you only need the .jar file, and can ignore the .pom file. Save the downloaded .jar file to the location where you want the application to run (for example, /applications/SXTCLI/
, /program files/SXTCLI/
, etc).
Once downloaded, you can test the SXTCLI by opening a terminal or command window, navigate to the folder where you saved the .jar, and enter:
java -jar sxtcli-<version>.jar version
You should see a menu returned, something like: Version: 0.0.6
Add CLI to your Shell Environment (optional)
While this step is technically optional, SXTCLI is significantly easier to use if you create an alias or shortcut to the .jar file, rather than always having to call java with a specific location.
Using your editor of choice, edit your shell profile:
Mac OSX | Linux | Windows WSL |
---|---|---|
~/.zshrc | ~/.bash_profile | ~/.bashrc |
Add this line to the bottom, replacing <PathToJar>
with the fully qualified path where you saved the downloaded jar file:
alias sxtcli='java -jar <PathToJar>/sxtcli-<LATEST_VERSION>.jar'
While you're editing your shell profile, consider the best practice of setting up a dotenv (.env) file at the same time!
To test your alias, open a new terminal and type: sxtcli version
If the alias is working correctly, you'll see the same answer as the above step: Version: 0.0.6
Congrats! You're ready to begin working with the SXTCLI!
If you're going to be using the SXTCLI extensively, check out this community project that creates additional shell functions to wrap and simplify the most common SXTCLI functions!
New ED25519 Keypair
To authenticate to the network using a public/private keypair, you'll first need to create a new keypair, and then use the SXTCLI to register it on the network.
Add an ED25519 keypair (for API calls)
Adding a new ED25519 keypair to your account requires authenticating once (to verify you are in control of the UserID) then registering the locally generated public key. We'll use Space and Time Studio to authenticate and provide the ACCESS_TOKEN, which proves we are in control of the account. Then we can use that ACCESS_TOKEN and the SXTCLI to validate our account and register a new keypair.
To add a new ED25519 keypair to your account:
-
Make sure the Space and Time CLI is installed.
-
Open a terminal / command window and enter:
sxtcli authenticate keypair
...which should produce an output similar to...ED25519 KeyPair (base64) Private key: ft2oz41+aXy1KtAyNzGOqxnHGbNSSXV8xpxN6FpN8w= Public key: eIMaJpSbLC6jyAG9RfwvklgsXbDc/HI2eH7TitL2pA=
Note: do not use the above keys - for proper security, please always generate your own new keys.
-
STOP NOW and save your new keys to a secure repository, such as an enterprise secrets manager or password manager. Like web3 wallets, lost keys cannot be recovered!
-
Navigate to Space and Time Studio, click on "Sign In" in the upper-right most corner, and log in with the same UserID which will receive the new keypair association.
-
Click on "My Account" in the upper-right (replaces "Sign In") and make sure you're on the "My Account" tab.
-
Scroll down to the section titled, "Quickly test out our API from Docs" and copy the "Access Token" show at the bottom:
- Return to your terminal window and use the SXTCLI to complete the registration process:
sxtcli authenticate keychain \ --accessToken="eyJ0eXBlIjoiYWNjZXNzIiwia2lkIjoiZTUxNDVkYmQtZGNmYi00ZjI4LTg3NzItZjVmNjN..." \ --url="https://api.spaceandtime.dev" \ add \ --privateKey="ft2oz41taXy1KtAyNzGOqxnHGbNSSXV8xpxN6FpN8w=" \ --publicKey="eIMaJpSbLC6jyAG9RfwvklgsXbDcdHI2eH7TitL2pA="
- Paste your ACCESS_TOKEN over the sample in line 2
- Paste your Private Key over the sample in line 5
- Paste your Public Key (wallet address) over the sample in line 6
- All other parameters can be left to default
If successful, you'll receive the message: key added to keychain
.
Congrats! You've successfully added a new keypair to your UserID's keychain!
You can add as many keypairs to your keychain as you'd like. For more information, check out this detailed article on connecting to the decentralized network.
To align with security best practices, Space and Time will never transmit your private key. Some tools like the SXTCLI will require your private key to cryptographically sign authentication challenge tokens, but that private key itself is never transmitted.
Valid Subscription (payment optional)
Most activities on Space and Time can be used with a trial user, however some things, like creating a table, require a subscription. That said, payment is not required to create a subscription.
Create a new subscription:
Creating a new subscription is extremely simple using Space and Time Studio! Having a subscription opens up certain features (like creating tables) that are unavailable without a subscription.
To create a new subscription:
Once you successfully create your subscription, the information on the right will be replaced with ways to manage your subscription / organization, and your name will lose it's "trial" tag.
Congrats! You now have a Space and Time subscription!
OR, join your organization's existing subscription:
If your organization already has an active Space and Time subscription that you'd like to join, your SxT admin must issue you a 'JoinCode'. The JoinCode they send you will look like a hexidecimal key, for example: 1115639b5baf85bba86153f8
You should see a green confirmation message in the upper right, and this section of the UI will be replaced with information on you current subscription.
Congrats! You've just joined a subscription!
The JoinCode expires 24 hours after generation and is consumed once used (for security) - so don't delay!
Python 3.10 or Higher
This demo uses the Space and Time Python SDK, which in turn requires Python 3.10 or higher. Check out the Python download page to get started!
This demo project's setup.sh
file will use pip
to install all dependencies, so no other installs are required (beyond NOT opting-out of pip). If you're curious, you can find the SXT Python SDK on pypi.org.
Setting Up Files/Folders
Like any project, we'll need a few structures to ensure we're separating code from credentials. If you're going to push to github, we'll also want to ensure a valid .gitignore
file. The script below will create a few empty files for us to edit, create and configure a python virtual environment, and perhaps more importantly, make sure we're not accidentally committing keys!
- Navigate to a location of your choice on your computer, and create a new, empty folder for your project, named something like
./demo_load_nyse
(referred to below as the 'project folder'). - Execute the script below. You can do this by either
- Opening a terminal window and copy/paste in the commands below, in order, or
- Create a
setup.sh
file, copy/paste the commands below into that file, save, and execute that file.
# !/bin/bash
# create python script file placeholder:
mkdir ./src
touch ./src/load_nyse.py
# create environment variables (.env) file placeholder:
mkdir ./.env
touch ./.env/secrets.env
# setup .gitignore file, with a few critical entries:
touch ./.gitignore
echo 'secrets.env' > ./.gitignore
echo 'venv_demo/' >> ./.gitignore
echo 'tables/' >> ./.gitignore
# create python virtual environment:
python3 -m venv venv_demo
source venv_demo/bin/activate
pip3 install --upgrade pip
pip3 install spaceandtime --upgrade
pip3 install yfinance --upgrade
deactivate
echo "Done! "
If done correctly, your project folder should now contain:
./src/
folder./src/load_nyse.py
empty file./.env/
folder./.env/secrets.env
empty file./.gitignore
file with 3 lines of text./venv_demo/
folder, containing the python virtual environment (with dependencies installed)
Some operating systems (like MacOS) will hide files and folders that start with a period by default. This means you may only see
src
andvenv_demo
folders. From a terminal you cancd
to your folder and typels -a
to see -all of the files and folders, including those hidden.Or, change your operating system's file navigator to show hidden files.
For example, on MacOS press [ cmd + shift + "." ] keys together to toggle viewing hidden files.
You only need to run the above script once - however it is safe to run many times. In fact, if you add to the bottom:
# run the python file
python3 ./src/load_nyse.py
Then it can also be used to update the environment and start your script, good for schedulers like Cron.
Credential File: secrets.env
secrets.env
Now that we have all the structures we need created, let's collect and save all our credentials into the newly created secrets.env
file. This allows for separation between code and credentials, and helps keep your secrets secret!
Let's start with a template - open up the secrets.env
file in your favorite IDE or text editor, and paste this content:
# This is a sample .env file, with sample secrets.
# Secrets below are designed to look real, but WILL NOT WORK.
# For help on creating your own SXT credentials, please see:
# https://docs.spaceandtime.io/docs/newuser
# Space and Time User Login Keys (Base64):
SXT_USER_ID="your_user_id"
SXT_USER_PUBLIC_KEY="eCHWo+ELB47xmp94aeZ+Cp9SAoh0w2PH3bhKzX0NNB0="
SXT_USER_PRIVATE_KEY="UELmpzmTgxLU+bfZkOIOJknRTO0eonkKjFNh0mOg5W8="
# Space and Time Table Keys (Hex):
RESOURCE_PUBLIC_KEY="347B389DE73D46BFB64C068A7D9C01142599E7BBBCCC4FABF404E134DDBA939"
RESOURCE_PRIVATE_KEY="CD3C070BD746439399BBE2590F77CB0C556419B00ED6D8D798D0C4317A4E7B2"
As noted in the text comment, the above keys are designed to LOOK right, but by design will not work. We'll need to replace these with your own keys.
There are two sets of keys here:
- SXTUSER* keys, which are needed to authenticate your UserID to the Space and Time network
- RESOURCE_* keys, which are used to prove ownership over your table or tables.
Both keys are ED25519 signatures, but the SXT_USER authorization requires Base64 encoding, while the RESOURCE requires Hex encoding. This is because the RESOURCE_PUBLIC_KEY is added to any DB object during creation, meaning it has to use SQL-Safe" characters, whereas SXT_USER keys prefer the denser format of Base64 encoding for shorter (and more frequent) messages.
Also, people will sometimes get USER and RESOURCE keys mixed up - having different encodings help bring visual clarity to which is which.
USER Keypair
If you completed all prerequisite steps, you should already have an ED25519 keypair associated with your UserID. If you missed that step, or just want to add a new keypair (there is no limit), check out:
Add an ED25519 keypair to your UserID
Adding a new ED25519 keypair to your account requires authenticating once (to verify you are in control of the UserID) then registering the locally generated public key. We'll use Space and Time Studio to authenticate and provide the ACCESS_TOKEN, which proves we are in control of the account. Then we can use that ACCESS_TOKEN and the SXTCLI to validate our account and register a new keypair.
To add a new ED25519 keypair to your account:
-
Make sure the Space and Time CLI is installed.
-
Open a terminal / command window and enter:
sxtcli authenticate keypair
...which should produce an output similar to...ED25519 KeyPair (base64) Private key: ft2oz41+aXy1KtAyNzGOqxnHGbNSSXV8xpxN6FpN8w= Public key: eIMaJpSbLC6jyAG9RfwvklgsXbDc/HI2eH7TitL2pA=
Note: do not use the above keys - for proper security, please always generate your own new keys.
-
STOP NOW and save your new keys to a secure repository, such as an enterprise secrets manager or password manager. Like web3 wallets, lost keys cannot be recovered!
-
Navigate to Space and Time Studio, click on "Sign In" in the upper-right most corner, and log in with the same UserID which will receive the new keypair association.
-
Click on "My Account" in the upper-right (replaces "Sign In") and make sure you're on the "My Account" tab.
-
Scroll down to the section titled, "Quickly test out our API from Docs" and copy the "Access Token" show at the bottom:
- Return to your terminal window and use the SXTCLI to complete the registration process:
sxtcli authenticate keychain \ --accessToken="eyJ0eXBlIjoiYWNjZXNzIiwia2lkIjoiZTUxNDVkYmQtZGNmYi00ZjI4LTg3NzItZjVmNjN..." \ --url="https://api.spaceandtime.dev" \ add \ --privateKey="ft2oz41taXy1KtAyNzGOqxnHGbNSSXV8xpxN6FpN8w=" \ --publicKey="eIMaJpSbLC6jyAG9RfwvklgsXbDcdHI2eH7TitL2pA="
- Paste your ACCESS_TOKEN over the sample in line 2
- Paste your Private Key over the sample in line 5
- Paste your Public Key (wallet address) over the sample in line 6
- All other parameters can be left to default
If successful, you'll receive the message: key added to keychain
.
Congrats! You've successfully added a new keypair to your UserID's keychain!
You can add as many keypairs to your keychain as you'd like. For more information, check out this detailed article on connecting to the decentralized network.
To align with security best practices, Space and Time will never transmit your private key. Some tools like the SXTCLI will require your private key to cryptographically sign authentication challenge tokens, but that private key itself is never transmitted.
Once you have your USER keypair, edit your secrets.env
file and plug in your SXT User_ID, Public Key and Private Key.
# Space and Time User Login Keys (Base64):
SXT_USER_ID="your_user_id"
SXT_USER_PUBLIC_KEY="your_new_public_key"
SXT_USER_PRIVATE_KEY="your_new_private_key"
Save your credentials somewhere safe! Using a password manager or enterprise secrets management software is best practice. Like any Web3 project, lost keys means losing control of your account.
RESOURCE Keypair
All resources (tables, views, etc.) on the Space and Time network must be created with a public/private keypair. The public key is added to the CREATE TABLE statement and saved on the network. The private key is used to sign self-describing authorization keys (aka biscuits) which authorize any request to that table.
This allows authorization to be decentralized, while allowing the owner of the table (aka the person holding the private key) to define and share signed biscuits that the table can verify as authentic. The self-describing nature of biscuits also allow that authorization to be limited to a particular table, person, time period, or subscription.
To get started quickly, you can use the SXTCLI to generate a new table keypair AND an admin-level biscuit, all with one command: Open up a terminal window and enter the command:
sxtcli biscuit generate wildcard
The above command will return:
- An ED25519 keypair (hex encoded) that can be used for creating tables, views, etc.
- A description of the biscuit capabilities, in this case:
- for ANY resource ("*") with a matching public key,
- allow ANY command ("*") - hence the name "wildcard" or admin biscuit
- The actual biscuit key itself
This output will be displayed as:
Biscuit KeyPair (hex)
Private key: D4EAC895937773E3047FE2E9AFA7004B80601BA0F687511BC23658D366FFB51B
Public key: 51297647296389ADABBB6E62230203C620DF47BF8DCAEEF1B1EC35507FBA48E0
Biscuit content:
Facts:
sxt:capability("*","*");
Biscuit:
EpABCiYKDnN4dDpjYXBhYmlsaXR5CgEqGAMiDwoNCIAIEgMYgQgSAxiBCBIkCAASIJK...
Note, the below key was modified to be invalid -- please generate your own fresh key!
Both Users and Tables / Views use the ED25519 keypair for security. However, there is a difference in encoding between the two:
- User authentication will expect Base64 encoding
- Tables and Views, which need it embedded in SQL, will expect Hex encoding
The CLI will return the correct encoding when generating new keys.
Store the private key somewhere safe! If you lose the private key, you lose control of the table, view, or resource!
Multiple tables or views can share a single keypair - for example, a User
table and a User_Settings
table may share the same keypair (and so wildcard biscuit), since they are related and often operated on together.
That said, it is NOT recommended to reuse keys too widely, as it broadens the exposure of any stolen or lost keys.
Congrats! You have created a new resource public/private keypair!
Add your newly created RESOURCE keys to your credential file, and save!
# Space and Time Table Keys (Hex):
RESOURCE_PUBLIC_KEY="your_new_resource_public_key"
RESOURCE_PRIVATE_KEY="your_new_resource_private_key"
Code File: load_nyse.py
load_nyse.py
For this demo, we'll use the Space and Time Python SDK. The above setup.sh
script will have created a python virtual environment, and pip installed the latest version of all requirements. That said, you can always install libraries manually by opening a terminal, making sure to activate the python virtual environment with: source venv_demo/bin/activate
, then enter pip3 install spaceandtime
.
If you get errors using
pip3
during installs, or the install runs but the program doesn't recognize the SpaceAndTime package, check the version of Python thatpip3
in installing to.
How to check the install location for pip and pip3
Both pip
and pip3
are links to the pip installer, and for most modern OS / python installations, these will both point to the same python installer. If you're on an older OS, or have a non-standard python install, it's possible they'll point to different locations.
To see the exact location each points to, you can run:
ls -l `which pip`
ls -l `which pip3`
...which will return output something like:
-rwxr-xr-x 1 <your user and role> 267 Jun 24 12:06 /Library/Frameworks/Python.framework/Versions/3.11/bin/pip
-rwxr-xr-x 1 <your user and role> 267 Jun 24 12:06 /Library/Frameworks/Python.framework/Versions/3.11/bin/pip3
In this case, we can see both point to the same version of python, and can be used interchangeably.
Python Imports
The python script will start with all imports at the top, as per standard practice. Copy/paste the below into your empty load_nyse.py
script file:
from spaceandtime import SpaceAndTime, SXTTable
from datetime import datetime, timedelta
import yfinance, pandas, os, re
The two imports you may not recognize:
- spaceandtime - the Space and Time network Python SDK, from which we're importing:
- SpaceAndTime - is a top-level convenience object that aggregates many of the lower-level objects
- SXTTable - allows control over a DB table in the Space and Time Database
- yfinance - the Yahoo! Finance SDK, for pulling stock (and crypto) price data
Authenticate to Space and Time Network
Next, we'll create a SpaceAndTime
object and use it to authenticate to the network, using our prepared credential file. Copy / paste the below code into the bottom of your load_nyse.py
file:
# connect to the network, and authenticate (must have an .env file)
sxt = SpaceAndTime(envfile_filepath='./.env/secrets.env')
sxt.authenticate()
It's that easy! The SXT object will read in your secrets file, and use it to authenticate to the network. You should see log stream display everything it read from your credential (aka dotenv) file, followed by:
2024-11-13_16:58:22 INFO SXT User instantiated: sxtdemo_user
2024-11-13_16:58:23 INFO Authentication Success: True
Congrats! You just logged into the Space and Time network!
Create your Table
Creating a table on Space and Time takes a few simple steps, which we'll break into different sections so we can describe as we go. You can copy each of the code sections into the bottom of your load_nyse.py
script as you progress:
Create the SXTTable Object
The first step is to create a new SXTTable object, supplying some initial parameters:
# create table object:
stocks = SXTTable(name = "SXTDemo.Stocks_some_unique_name",
private_key = os.getenv('RESOURCE_PRIVATE_KEY'),
access_type = sxt.TABLE_ACCESS.PUBLIC_READ,
SpaceAndTime_parent=sxt)
On the first code line, we're defining the SXTTable object, defined as the variable stocks
. The SXTTable object has many options, we're just setting the most common above:
- name = the full SCHEMA.TABLE_NAME as a DB compliant string (alpha-numeric and underscore only).
SCHEMA.TableNames are globally unique, so you'll likely need to append some unique suffix to the end of the string, like your UserID. If you download the github repo, you'll notice it automatically appends your UserID and does a regex substitution of any non-DB safe characters (since UserIDs can have non-DB save characters). If you want to emulate that here, simply replace the first code line above with:stocks = SXTTable(name = f"SXTDemo.Stocks_{re.sub(r'[^a-zA-Z0-9_]', '_', sxt.user.user_id)}",
- private_key = the RESOURCE private key from your credential file.
- access_type = the default access provided, if no authorization biscuit is supplied.
In this case, we're setting to PUBLC_READ, meaning anyone can read the table, but only you can modify. Check out this page for the entire list of table options. - SpaceAndTime_parent = this allows the SXTTable object to inherit settings from the above SpaceAndTime object, such as logging, description names, and most importantly, the authenticated default UserID - which allows the table to submit requests to the Space and Time Database directly.
Per the last parameter - only SXTUser objects can authenticate to Space and Time and submit queries. The SpaceAndTime convenience object contains a default user object, making it easy to get started quickly; two lines of code to authenticate!
Tables cannot submit queries to the network. However, by providing an authenticated user to the SXTTable object, the table itself can now run queries on the network using the user's connection - check out functions like stocks.select()
or stocks.delete()
or stocks.insert.with_list_of_dicts
. Thus, setting the SpaceAndTime_parent = sxt
allows the table object to inherit the default user's connection and manage its own state.
If you want to play around with the default user, you can always access from the sxt object: print(sxt.user)
Define the CREATE TABLE SQL
Next, let's set the CREATE TABLE query - if you've ever created a database table before, this will look very familiar:
stocks.create_ddl = """
CREATE TABLE {table_name}
( Symbol VARCHAR
,Stock_Date DATE
,Stock_Open DECIMAL
,Stock_High DECIMAL
,Stock_Low DECIMAL
,Stock_Close DECIMAL
,Stock_AdjClose DECIMAL
,Stock_Volume BigInt
,PRIMARY KEY (Symbol,Stock_Date)
) {with}
"""
There are two substitutions here:
- {table_name} = is substituted with the
stocks.table_name
that was set during object creation - {with} = The
WITH
statement is the single non-standard extension required by the Space and Time Database.
After setting the stocks.create_ddl
above, you can turn around and print(stocks.create_ddl)
to see the SQL being sent to the database:
CREATE TABLE SXTDemo.Stocks_sxtdemo_user
( Symbol VARCHAR
,Stock_Date DATE
,Stock_Open DECIMAL
,Stock_High DECIMAL
,Stock_Low DECIMAL
,Stock_Close DECIMAL
,Stock_AdjClose DECIMAL
,Stock_Volume BigInt
,PRIMARY KEY (Symbol,Stock_Date)
) WITH "public_key=85abf..., access_type=public_read"
You'll note the two parameters were automatically substituted, most notably, the {with}
expanded to include the full WITH "public_key=<your_RESOURCE_public_key>, access_type=public_read"
which should reflect our definition above.
Also note the second-to-last line, ,PRIMARY KEY (Symbol,Stock_Date)
. The Primary Key of a table tells the Space and Time Database how to uniquely identify any given row. this means that the combination of Symbol and Stock_Date must be unique. Inserting two rows with the same Symbol and Stock_Date will cause the insert to fail, with a "unique key violation" error.
Create Authorization Biscuits
If the SXTTable has a private_key set (like we supplied above), it can generate its own authorization biscuits. You will need at least one ADMIN biscuit that has CREATE TABLE permission, with the others being a best practice, but optional:
# create two permissions (biscuits), one Admin, and one just to Load
stocks.add_biscuit('Admin', sxt.GRANT.ALL)
stocks.add_biscuit('Load', sxt.GRANT.INSERT, sxt.GRANT.UPDATE,
sxt.GRANT.DELETE, sxt.GRANT.SELECT)
stocks.add_biscuit('Read', sxt.GRANT.SELECT)
These biscuits are saved by the SXTTable object and applied automatically to any query that the SXTTable object runs. If you get an Authorization Error, check that (a) stocks.private_key
is valid, and stocks.biscuits
contains at least one Admin biscuit.
What is a biscuit?
Biscuits are self-describing authorizations that can be verified by the network, and used to share access to resources (tables, views, etc.) on the Space and Time network.
Biscuits are created locally (client) using the Private Key of the database object (table or view), which is then verified on the network (server) with the object's Public key.
The process of creating a biscuit looks like:
-
Create an ED25519 public/private keypair
-
Create the desired object, with the Public key from #1 added to the
WITH
statement (required) -
Submit the final
CREATE...
statement to the Space and Time network to be created, again including the Public key. -
With the same Private key from #1, create a biscuit with the appropriate constraints and authorizations.
-
The user submits a query to the network, and includes a biscuit.
-
The table or view verifies the biscuit instructions, filters and capabilities using it's Public key, which must match the corresponding Private key from step #4
The ED25519 keypair created for tables and views are encoded in HEX, whereas the ED25519 keypair created for Users are encoded as BASE64. This is done to reinforce the fact that table keypairs and user keypairs are different. Also, HEX is SQL-Safe (needed to embed in CREATE statement), whereas Base64 is not.
For our use-case here, Python is managing the creating of biscuits, and attaching them to requests.
Save your Table Definition to a File
This next step is deceptively important: save all settings for the stocks
table to a uniquely named file, under a ./tables/
subfolder:
# save all table settings to file
stocks.save(stocks.recommended_filename)
This file will include the SQL statement, the public and private keys, and all biscuits - everything needed to recreate the object. Instead of serializing to a .pickle or binary file, this saves to a shell friendly format, meaning you can load the file into a terminal's environment variables and continue interacting with the table using the SXTCLI or other OS scripting tools.
Most importantly, the recommended_filename
property will create a timestamped file, meaning it automatically captures a snapshop of changes over time (although you can provide any name you'd like).
Create the Table on the Network
Up to this point, everything we've done to configure the stocks
table has been made locally - nothing has been sent to the Space and Time network yet. Time to make it real!
Copy this command into your file to (a) check if the table already exists on the network, and if not, (b) create it!
# actually create, if missing:
if not stocks.exists: stocks.create()
This will create the table in the network! Note that you can run this script repeatedly - if the table exists the second time you run it, the create will be skipped.
Congrats! You've created your
stocks
table on the Space and Time network!
Put it All Together
Let's put this all together into one script - your load_nyse.py
script should now look something like:
from spaceandtime import SpaceAndTime, SXTTable
from datetime import datetime, timedelta
import yfinance, pandas, os, re
# connect to the network, and authenticate (must have an .env file)
sxt = SpaceAndTime(envfile_filepath='./.env/secrets.env')
sxt.authenticate()
# create table object (appending your userid to the end of tablename):
stocks = SXTTable(name = f"SXTDemo.Stocks_{re.sub(r'[^a-zA-Z0-9_]', '_', sxt.user.user_id)}",
private_key = os.getenv('RESOURCE_PRIVATE_KEY'),
access_type = sxt.TABLE_ACCESS.PUBLIC_READ,
SpaceAndTime_parent=sxt)
stocks.create_ddl = """
CREATE TABLE {table_name}
( Symbol VARCHAR
,Stock_Date DATE
,Stock_Open DECIMAL
,Stock_High DECIMAL
,Stock_Low DECIMAL
,Stock_Close DECIMAL
,Stock_AdjClose DECIMAL
,Stock_Volume BigInt
,PRIMARY KEY (Symbol,Stock_Date)
) {with}
"""
# create three permissions (biscuits), one Admin, one for Reading, and one just to Load
stocks.add_biscuit('Admin', sxt.GRANT.ALL)
stocks.add_biscuit('Load', sxt.GRANT.INSERT, sxt.GRANT.UPDATE,
sxt.GRANT.DELETE, sxt.GRANT.SELECT)
stocks.add_biscuit('Read', sxt.GRANT.SELECT)
# save all table settings to file
stocks.save(stocks.recommended_filename)
# actually create, if missing:
if not stocks.exists: stocks.create()
Troubleshooting
Sometimes things go wrong! Here are a few common issues / mistakes:
- Authentication only lasts for 25 minutes before needing to be refreshed - if you're really exploring each step above, your ACCESS_TOKEN may have timed out.
Resolve: Simply re-authentication with:sxt.authenticate()
- If you get an "Authorization" error, it means something was wrong with your biscuit, or the private key used to create the biscuit. Or, you created the table with one keypair, then created the biscuit with a different keypair. Remember that the same private key must create both the Public Key assigned to the table at CREATE time, as well as all biscuits.
Resolve: Check that your keys are being passed in from your credential file correctly, and that the table's private key and biscuit's private key are the same - i.e,. the private key listed in both objects should match:
stocks.get_biscuit('Admin')
should matchstocks.private_key
- If you get a "table already exists" error, then your SCHEMA.TableName is not globally unique - someone beat you to it!
Resolve: Change your tablename to be more unique, and try again.
Assuming you're authenticated, you can test if a SCHEMA.TableName exists with:stocks.table_name = "SXTDemo.Stocks_more_unique_suffix" stocks.exists False
- Errors stating that your table is in the "Public" schema are caused by forgetting to assign a schema.
Resolve: Make sure to include a valid schema - above we've been usingSXTDemo.Stocks_some_suffix
, whereSXTDemo
is the schema andStocks_some_suffix
is the tablename. - Likewise, Scheme Doesn't Exist error indicates you have a typo in your schema name, or changed it to a schema that does not exist.
Resolve: Change your schema to something that exists, likeSXTDemo
(created specifically for demos), or create your own new schema by issuing the SQL command:CREATE SCHEMA SchemaName
Pull API Data and Load Table
This next section includes more pure-Python components, pulling stock data from the yfinance
SDK, transforming the resulting Pandas dataframe into a Python list-of-dictionary objects (aka rows and columns), which can natively be inserted into Space and Time using the SDK.
Copy this larger block of text and paste it at the bottom of your load_nyse.py
script:
# ----------------------------------------
# ------- PULL AND LOAD DATA -------------
# ----------------------------------------
# define start/end dates (7 rolling days)
end_date = datetime.now().strftime('%Y-%m-%d')
start_date = (datetime.now() - timedelta(days=7)).strftime('%Y-%m-%d')
for symbol in ['AAPL', 'GOOGL', 'AMZN', 'MSFT', 'RIVN', 'NVDA',
'BTC-USD','ETH-USD','SOL-USD','ADA-USD', 'AVAX-USD']:
sxt.logger.info(f'Processing: {symbol} between {start_date} and {end_date}')
# download price data
data = yfinance.download(symbol, start=start_date, end=end_date)
if data.empty or data is None:
print(f"No new data detected for {symbol}")
else:
# transform pandas dataframe to list of dicts
data = pandas.DataFrame(data).reset_index() # add date (from index)
data.insert(loc=0, column='Symbol', value=symbol) # add symbol to front
data.columns = ['Symbol', 'Stock_Date', 'Stock_Open', 'Stock_High', 'Stock_Low',
'Stock_Close', 'Stock_AdjClose', 'Stock_Volume'] # rename to table column names
data['Stock_Date'] = data['Stock_Date'].dt.strftime('%Y-%m-%d') # convert date to string
data_load = data.to_dict(orient='records') # convert to list of dicts
# delete any pre-existing data, before inserting:
stocks.delete(where = f"""
Symbol = '{symbol}'
AND Stock_Date between '{start_date}' AND '{end_date}' """)
# insert into SXT
success, response = stocks.insert.with_list_of_dicts(data_load)
if success:
sxt.logger.info('SUCCESSFUL INSERT!')
else:
sxt.logger.error(f'ERROR ON INSERT: {response}')
This process does the following, in order from top-to-bottom:
- Define a date range, using
start_date
andend_date
. - Loop through a list of stock
symbol
s, for both stocks and crypto prices. - Call the
yfinance.download
function to return data for the symbol, for the date range. If the call returns no data, log and skip to the next symbol. - The
data
object returned is a Pandas dataframe - the next 6 lines of code manipulate the dataframe into the same structure as our table, then turns it into a list-of-dictionaries. - Delete any records in our Space and Time table that might already exist for that stock Symbol and Date, using the SXTObject build-in
stocks.delete()
function.
This step is a good example of the table managing itself - we're simply providing the WHERE statement, the object is managing the creation of the rest of the SQL, as well as execution. It's also worth noting that, to protect the table from accidental deletion, thewhere
parameter is required (if you want to delete ALL records, supply something like1=1
). - Use the SXTTable Objects built in
stocks.insert.with_list_of_dicts()
function, which iterates through a list (rows) of dictionary objects (columns) and inserts them into the table.
Each insert is crafted based on the names found in each dictionary, meaning every row can contain a different number of columns, and the insert will still work. This makes this function very flexible and save, but not the fastest, since every row is individually loaded. - Log success or failure of the insert, then back to the top of the loop for the next stock symbol
Save your load_nyse.py
file and give it a run! You should see a series of log entries fly by in the terminal, showing blocks of activity like below, one for each symbol:
2024-11-1320:52:41 INFO Processing: BTC-USD between 2024-11-06 and 2024-11-13
[*100%***] 1 of 1 completed
2024-11-13_20:52:41 INFO DELETING: DELETE FROM SXTDemo.Stocks WHERE
Symbol = 'BTC-USD'
AND StockDate between '2024-11-06' AND '2024-11-13'
2024-11-13_20:52:43 INFO INSERT 7 rows into SXTDemo.Stocks...
2024-11-1320:52:43 INFO SXTDemo.Stocks Inserted Row 1 of 7 (14%) - Successes: 1 Erred: 0
2024-11-1320:52:43 INFO SXTDemo.Stocks Inserted Row 2 of 7 (29%) - Successes: 2 Erred: 0
2024-11-1320:52:43 INFO SXTDemo.Stocks Inserted Row 3 of 7 (43%) - Successes: 3 Erred: 0
2024-11-1320:52:43 INFO SXTDemo.Stocks Inserted Row 4 of 7 (57%) - Successes: 4 Erred: 0
2024-11-1320:52:43 INFO SXTDemo.Stocks Inserted Row 5 of 7 (71%) - Successes: 5 Erred: 0
2024-11-1320:52:44 INFO SXTDemo.Stocks Inserted Row 6 of 7 (86%) - Successes: 6 Erred: 0
2024-11-1320:52:44 INFO SXTDemo.Stocks Inserted Row 7 of 7 (100%) - Successes: 7 Erred: 0
2024-11-1320:52:44 INFO INSERT into SXTDemo.Stocks complete - Total Rows: 7, Successes: 7, Erred: 0
2024-11-13_20:52:44 INFO SUCCESSFUL INSERT!
Congrats! You successfully created a program to load Stock API data into Space And Time!
Run a Query on Your New Data
This last section simply demonstrates how to run a query on the data you just completed running above. Copy this and paste it at the end of your load_nyse.py
file!
# Find the overall highest and lowest price for the time period
success, data = stocks.select(f"""
select distinct Symbol
, min( stock_low ) over(partition by Symbol) as period_lowest_price
, max( stock_high ) over(partition by Symbol) as period_highest_price
from {stocks.table_name}
where Stock_Date between '{start_date}' and '{end_date}'
order by Symbol """)
if success: print('\n'.join([str(r) for r in data]))
print('\n\nDone!\n\n')
This query aggregates to the absolute lowest and highest observed price during the time period, and prints the result to the terminal.
Congrats! You've run a SELECT query on your newly loaded data!
That's it! In less than 100 lines of code, you have
- Authenticated into Space and Time
- Created a table and several permission sets
- Repeatedly connected to an outside AP to collect data
- Loaded that data into your new table
- Ran an aggregation query and displayed the results!
Nicely Done!!!
You've successfully loaded data from an API into the Space and Time network!
You're ready for big things!
Next Steps / Other Things to Try
Query Your Table From SXT Studio
Because we created our table with an access_type=Public_Read
, there are no special biscuits required to SELECT from the data in SXT Studio. Try logging in, running a query on your new table, and even generate visualizations!
Add Authorization to SXT Studio
That said - you will NOT have permissions to modify the table without authorization. You can try to run the query:
DELETE FROM SXTDemo.Stocks_your_table_name
and will get an authorization error (unless you've previously loaded biscuits). You can fix this by adding biscuits to the SXT Studio, by going to My Account
under Permissions and Keys
:
You'll need both the public key and Admin biscuit -- fortunately you saved those to a file in your script above!
Speed Up Inserts
As mentioned above, the insert method we selected is VERY flexible, however, is also fairly slow. Since we have a fixed structure to the data, we can change the code above, to create the INSERT statement with only ONE column definition for many rows of inserts. While we're only inserting a few rows in the above example, this approach can speed up loads by many orders-of-magnitude.
Try switching out the stocks.insert.with_list_of_dicts()
with the stocks.insert.with_sqltext()
:
# insert into SXT
# success, response = stocks.insert.with_list_of_dicts(data_load)
insert_head = f'INSERT INTO {stocks.table_name} ( {", ".join(list(data.columns))} ) VALUES \n '
insert_body = '\n,'.join(['('+', '.join([f"'{c}'" for c in list(r.values())])+')' for r in data_load])
success, response = stocks.insert.with_sqltext(insert_head + insert_body)
if success:
sxt.logger.info('SUCCESSFUL INSERT!')
else:
sxt.logger.error(f'ERROR ON INSERT: {response}')
If you want to really see the speed increase, change the number of days from 7 to 100!
Try Another Data Source
This same approach will work with many data sources - CSV files, streams, other APIs, etc. See if you can interrogate the structure of a CSV file, create a table, and load that table - in less than 200 lines of code!
And if you get working, be sure to contribute it back to the SXT-Community on Github!
We ❤️ contributors!
Updated 2 months ago