Setup
If you are using a Docker image (PostGIS/PostgreSQL), you will need to set up the environment as follows (the Docker images do not ship with these tools installed):
CMD: Log into the container and install the packages needed to install cargo and build pg_parquet:
podman exec -it spatialytics-postgis bash
apt-get update && apt-get install -y curl gcc libssl-dev pkg-config git postgresql-17 clang-16 postgresql-server-dev-17
CMD: Add the postgres user to the sudo group:
usermod -aG sudo postgres
Notes: postgresql-17: we need the latest version so that cargo-pgrx runs. clang-16 and postgresql-server-dev-17: needed to build pg_parquet.
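Before moving on, it can help to confirm that the tools from the apt-get step are actually on the PATH. The following is a minimal sketch, not part of the pg_parquet instructions: missing_tools is a hypothetical helper, and the command names passed to it are assumptions about what the packages above provide.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is NOT found on PATH,
# one per line. Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Assumed command names for the packages installed above.
missing_tools curl gcc pkg-config git pg_config
```

If the last line prints nothing, every listed command was found; any name it prints points at a package that still needs installing.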
Switch to the non-root user (postgres in these notes), go to that user's home directory, and continue from there:
CMD: Install Rust and cargo with rustup, following the pg_parquet installation-from-source instructions (-s -- -y answers yes to the defaults):
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
You will see:
> curl https://sh.rustup.rs -sSf | sh -s -- -y
...
Rust is installed now. Great!
To get started you may need to restart your current shell.
This would reload your PATH environment variable to include
Cargo's bin directory ($HOME/.cargo/bin).
To configure your current shell, you need to source
the corresponding env file under $HOME/.cargo.
This is usually done by running one of the following (note the leading DOT):
. "$HOME/.cargo/env" # For sh/bash/zsh/ash/dash/pdksh
source "$HOME/.cargo/env.fish" # For fish
source "$HOME/.cargo/env.nu" # For nushell
CMD: Source the env file so that cargo is on the PATH in your current shell:
source "$HOME/.cargo/env"
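To check that sourcing worked, you can test whether Cargo's bin directory is now one of the PATH entries. This is only a sketch: path_contains is a hypothetical helper written for these notes, not something rustup provides.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $2 is one of the
# colon-separated entries of the PATH-style string $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$PATH" "$HOME/.cargo/bin"; then
    echo "cargo bin directory is on PATH"
else
    echo "cargo bin directory is missing from PATH; source \$HOME/.cargo/env first"
fi
```

Wrapping both strings in colons lets the match treat the first and last PATH entries the same as the ones in the middle.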
CMD: Install gcc, the headers for openssl, and pkg-config (this should already be done if you ran the apt-get install step at the start):
apt-get install -y gcc libssl-dev pkg-config
NOTE: To install cargo-pgrx, cargo will try to find openssl via: PKG_CONFIG_ALLOW_SYSTEM_CFLAGS=1 pkg-config --libs --cflags openssl
CMD: Now we can install cargo-pgrx. Install a pinned version with --locked, like this:
cargo install cargo-pgrx --version "0.13.1" --locked
From docs:
# install cargo-pgrx
> cargo install cargo-pgrx
# install this way until the issue is patched
# https://github.com/pgcentralfoundation/pgrx/issues/2009
# https://github.com/pgcentralfoundation/pgrx/issues/2016
CMD: configure pgrx
cargo pgrx init --pg17 $(which pg_config)
NOTE: This caveat should not apply if you switched to a non-root user earlier. When run as root, the init step skips initdb (initdb cannot run as the root user), so it probably will not work:
root@1d951fd1d999:/# cargo pgrx init --pg17 $(which pg_config)
Creating PGRX_HOME at `/root/.pgrx`
Validating /usr/bin/pg_config
Skipping initdb as current user is root user
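A small guard can catch this before running the init step. The sketch below is an assumption about how you might script the check; is_root_uid is a hypothetical helper, and the messages are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the given numeric user id is root (uid 0).
is_root_uid() {
    [ "$1" -eq 0 ]
}

if is_root_uid "$(id -u)"; then
    echo "running as root: switch to a non-root user before cargo pgrx init" >&2
else
    echo "running as $(id -un): ok to run cargo pgrx init"
fi
```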
CMD: append the extension to shared_preload_libraries
echo "shared_preload_libraries = 'pg_parquet'" >> ~/.pgrx/data-17/postgresql.conf
In the Debian PostGIS Docker image the conf file lives elsewhere, so find its location first and append to that file instead (https://stackoverflow.com/a/3603162):
> psql -U postgres -c 'SHOW config_file'
config_file
------------------------------------------
/var/lib/postgresql/data/postgresql.conf
(1 row)
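Running the echo append more than once leaves duplicate lines in the conf file. A minimal sketch of an idempotent version follows, assuming a hypothetical helper named add_preload_library; the demo runs against a temporary file rather than a real postgresql.conf.

```shell
#!/bin/sh
# Hypothetical helper: append "shared_preload_libraries = '<lib>'" to a conf
# file unless an uncommented shared_preload_libraries line already names <lib>.
add_preload_library() {
    conf=$1
    lib=$2
    if grep -Eq "^[[:space:]]*shared_preload_libraries[[:space:]]*=.*${lib}" "$conf"; then
        return 0
    fi
    printf "shared_preload_libraries = '%s'\n" "$lib" >> "$conf"
}

# Demo on a throwaway file: the second call is a no-op.
demo_conf=$(mktemp)
add_preload_library "$demo_conf" pg_parquet
add_preload_library "$demo_conf" pg_parquet
cat "$demo_conf"
rm -f "$demo_conf"
```

For the real file, pass the path reported by SHOW config_file as the first argument. Two caveats: when a parameter appears more than once in postgresql.conf the last entry wins, so if the file already preloads other libraries, edit that existing line by hand instead of appending. PostgreSQL must also be restarted before a change to shared_preload_libraries takes effect.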
CMD (probably not needed, because everything is already installed): Install sudo:
> apt-get install -y sudo
Clone the repo:
git clone https://github.com/CrunchyData/pg_parquet.git && cd pg_parquet
Then:
# initialize a data directory, build and install the extension (to the targets specified by the configured pg_config), then connect to a session
> cargo pgrx run
# alternatively you can only build and install the extension (pass --release flag for production binary)
> cargo pgrx install --release
# create the extension in the database
psql> CREATE EXTENSION pg_parquet;
Installing
Enable pg_parquet in the target database, then import a Parquet file:
CREATE EXTENSION IF NOT EXISTS pg_parquet;
-- Import the Parquet file directly
COPY my_table FROM '/path/to/file.parquet' WITH (FORMAT 'parquet');