Recently, I had a Siebel Clinical Trial Management System (CTMS) client requirement to load data into the database, clearly the purview of Siebel Enterprise Integration Manager (EIM). The client sent me the data in a spreadsheet, and it looked something like this:
Site_Number | Contact_ID | Role |
800,1,300,500 | 1-2ASW | Pharmacist |
6,7,2 | 1-4RSW | Doctor |
1,2,4,6,7,8 | 1-5E33 | Lead CRA |
In order to use SQL Loader to get the data into the EIM tables (in an Oracle database), I needed to massage the data into a form that looks like this:
Site_Number | Contact_ID | Role |
800 | 1-2ASW | Pharmacist |
1 | 1-2ASW | Pharmacist |
300 | 1-2ASW | Pharmacist |
500 | 1-2ASW | Pharmacist |
6 | 1-4RSW | Doctor |
7 | 1-4RSW | Doctor |
2 | 1-4RSW | Doctor |
6,7,2 | 1-4RSW | Doctor |
1 | 1-5E33 | Lead CRA |
2 | 1-5E33 | Lead CRA |
4 | 1-5E33 | Lead CRA |
6 | 1-5E33 | Lead CRA |
7 | 1-5E33 | Lead CRA |
8 | 1-5E33 | Lead CRA |
But, how do I do this? Enter open source PostgreSQL 10!
The first step is to save the Excel file as a CSV file.
The second step is issuing this statement in PostgreSQL:
CREATE FOREIGN TABLE aft_staging
(site_number text,
Contact_id text,
Role text)
Server pglog
OPTIONS (filename ‘/Users/reneschmidt/inputfile.csv’, format ‘csv’, header ‘true’);
This allows me to issue SQL statements from PostgreSQL using data that is an external file. So, for example, I can issue this command; select * from customer_staging:
The next step is to use the super handy unnext function to transform the data:
select contact_id,person_role,ltrim(unnest(string_to_array(site_number, ‘,’))) site from customer_staging;
The final step is to export the data. I did it by issuing this command:
\copy (select contact_id,person_role,ltrim(unnest(string_to_array(site_number, ‘,’))) site from customer_staging) TO ‘/Users/reneschmidt/outputfile.csv’ WITH (FORMAT CSV);
Opening the file shows this: