Author: Francis Ndungu
Last Updated: Wed, Jan 4, 2023
A Comma Separated Values (CSV) file is a plain-text file that stores data in tabular format. CSV files are commonly used to share data between spreadsheet applications and database systems. They separate field values and records using commas and new line characters. When working with PostgreSQL databases, you can easily import data from CSV files using the \COPY command, which is available in the psql client. This guide shows you how to copy CSV data from an Ubuntu 20.04 server to a managed PostgreSQL database cluster.
To proceed with this guide:
Log in to your Vultr account. Navigate to Products, then Databases. Click the name of the PostgreSQL database cluster under the Managed Database Name. Navigate to the Overview tab and locate Connection Details. This guide uses the following sample connection details:
In this guide, you need to create a sample CSV file. When working in a production environment, you may generate the CSV file automatically from spreadsheet applications or other database systems. Follow the steps below to create the sample file:
SSH to your server and open a new products.csv file in a text editor.

$ nano products.csv
Enter the following information into the products.csv file:

product_id,product_name,retail_price,available,created_on
1,20L FOUNTAIN MINERAL WATER,5.25,Y,2022-12-12
2,QUARTZ WALL CLOCK,30.20,N,2022-12-10
3,RECHARGEABLE LED LAMP,14.40,Y,2022-12-08
4,TYPE C 30W CHARGER,29.95,Y,2022-12-10
5,DOUBLE SIDED TAPE,3.45,N,2022-12-12
Save and close the file.
After setting up the CSV file, proceed to the next step to copy data into a PostgreSQL database cluster.
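In a production environment, you may prefer to generate the file with a script instead of typing it by hand. The following sketch, which assumes Python 3 is available on the server, writes the same sample products.csv using the standard csv module:

```python
import csv

# Header line plus the five sample product records from this guide
rows = [
    ["product_id", "product_name", "retail_price", "available", "created_on"],
    ["1", "20L FOUNTAIN MINERAL WATER", "5.25", "Y", "2022-12-12"],
    ["2", "QUARTZ WALL CLOCK", "30.20", "N", "2022-12-10"],
    ["3", "RECHARGEABLE LED LAMP", "14.40", "Y", "2022-12-08"],
    ["4", "TYPE C 30W CHARGER", "29.95", "Y", "2022-12-10"],
    ["5", "DOUBLE SIDED TAPE", "3.45", "N", "2022-12-12"],
]

# newline="" prevents the csv module from adding extra blank lines on some platforms
with open("products.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```

Run the script in the same directory where you plan to keep the file; the result is identical to the hand-typed version above.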
To import data into a managed PostgreSQL database cluster from a CSV file, follow the steps below:
Log in to a managed PostgreSQL database cluster.
Create a database.
Create a table structure on the target PostgreSQL database that matches the data types of the CSV file fields.
Use the psql \COPY command to load the CSV data into the table.
Start by installing the postgresql-client package. This package provides the PostgreSQL psql command on the Linux server.

$ sudo apt update
$ sudo apt install -y postgresql-client
Run the following psql command to log in to the managed PostgreSQL cluster. Replace the -h (host), -p (port), and -U (username) values with the correct credentials.
$ psql -h SAMPLE_POSTGRESQL_DB_HOST_STRING.vultrdb.com -p 16751 -U vultradmin defaultdb
Enter the password for the managed database cluster and press ENTER to proceed.
Create a sample xyz_db database.
defaultdb=> CREATE DATABASE xyz_db;
Connect to the new xyz_db database.
defaultdb=> \c xyz_db;
psql ...
...
You are now connected to database "xyz_db" as user "vultradmin".
After creating a database, initialize the table structure in the next step.
After setting up and connecting to the target database, follow the steps below to create a table:
Match the products.csv field values with the correct PostgreSQL data types as illustrated in the following list.
+---------------+---------------------+
| Field Name    | Data Type           |
+---------------+---------------------+
| product_id    | INTEGER             |
| product_name  | VARCHAR(50)         |
| retail_price  | FLOAT               |
| available     | CHAR(1)             |
| created_on    | DATE                |
+---------------+---------------------+
Use the PostgreSQL CREATE TABLE command to set up a products table.

xyz_db=> CREATE TABLE products (
             product_id INTEGER PRIMARY KEY,
             product_name VARCHAR(50),
             retail_price FLOAT,
             available CHAR(1),
             created_on DATE
         );
Run the following describe statement against the products table to ensure you have the correct schema.
xyz_db=> \d products;
    Column    |         Type          | Collation | Nullable | Default
--------------+-----------------------+-----------+----------+---------
 product_id   | integer               |           | not null |
 product_name | character varying(50) |           |          |
 retail_price | double precision      |           |          |
 available    | character(1)          |           |          |
 created_on   | date                  |           |          |
Indexes:
    "products_pkey" PRIMARY KEY, btree (product_id)
You have now set up a target database and a table for the data. Follow the next step to load data into the database.
\COPY Command to Load Data
The psql \COPY command follows the syntax below:
=> \COPY SAMPLE_TABLE (SAMPLE_COMMA_SEPARATED_COLUMNS) FROM 'SAMPLE_ABSOLUTE_FILE_PATH' WITH (SAMPLE_OPTIONS);
\COPY syntax explained:
The command starts with the \COPY keyword.

SAMPLE_TABLE is the name of the target table where you want to place the data (For example, products).

(SAMPLE_COMMA_SEPARATED_COLUMNS) is an optional parameter that defines the order of the columns in the target table.

The FROM 'SAMPLE_ABSOLUTE_FILE_PATH' statement defines the absolute path where you have placed your CSV file (For example, '~/products.csv').

The WITH (SAMPLE_OPTIONS) keyword defines the options that the PostgreSQL server should consider when executing the \COPY command, as illustrated below:
FORMAT: This is the input data format (For example, CSV).

DELIMITER: You should define the character separating the field values after the DELIMITER keyword (For example, ',').

HEADER: This directive tells the PostgreSQL server whether the raw input CSV file contains a header. If you set this value to true (HEADER true), PostgreSQL discards the first line when loading data. However, if your CSV file doesn't contain a header, set this value to false using the HEADER false statement to load everything in the file.
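To see what the HEADER option means in practice, the following Python sketch (an informal illustration, not how the PostgreSQL server is implemented) parses a CSV fragment the way \COPY does with HEADER true: the first line is read and discarded, and only the remaining lines become records:

```python
import csv
import io

# A fragment of the sample CSV file, including its header line
data = """product_id,product_name,retail_price,available,created_on
1,20L FOUNTAIN MINERAL WATER,5.25,Y,2022-12-12
2,QUARTZ WALL CLOCK,30.20,N,2022-12-10
"""

reader = csv.reader(io.StringIO(data))
header = next(reader)     # HEADER true: the first line is discarded, not loaded
records = list(reader)    # only these rows would be inserted into the table

print(header[0])          # prints "product_id"
print(len(records))       # prints 2
```

With HEADER false, the next(reader) step would be skipped and the header line would be treated as an ordinary record, which typically fails with a data type error on an INTEGER column such as product_id.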
After understanding the PostgreSQL \COPY syntax, follow the steps below to load and verify the CSV data.
Run the following command to copy data from the '~/products.csv' file into the products table:
xyz_db=> \COPY products (product_id, product_name, retail_price, available, created_on) FROM '~/products.csv' WITH (FORMAT CSV, DELIMITER ',', HEADER true);
Query the products table to ensure the data is in place.
xyz_db=> SELECT product_id, product_name, retail_price, available, created_on FROM products;

 product_id |        product_name        | retail_price | available | created_on
------------+----------------------------+--------------+-----------+------------
          1 | 20L FOUNTAIN MINERAL WATER |         5.25 | Y         | 2022-12-12
          2 | QUARTZ WALL CLOCK          |         30.2 | N         | 2022-12-10
          3 | RECHARGEABLE LED LAMP      |         14.4 | Y         | 2022-12-08
          4 | TYPE C 30W CHARGER         |        29.95 | Y         | 2022-12-10
          5 | DOUBLE SIDED TAPE          |         3.45 | N         | 2022-12-12
(5 rows)
The above output confirms that you've successfully loaded data from your CSV file into the managed PostgreSQL database cluster.
This guide illustrates the process of copying data from a CSV file into a managed PostgreSQL database cluster from an Ubuntu 20.04 server. If you have several CSV files, repeat the above process to load each file into its respective table.
Check out the following guides to learn more about Vultr's managed databases: