To implement a new grade control database at a mine site to contain data from a pre-existing and live exploration database. The data from the exploration database is used to inform resource data gathering and resource modelling at the mine site. However, there is a need for a separate database due to poor internet connectivity in the country and the fact that the exploration and grade control teams are separate functions in this organisation.
Two structurally identical DataShed (maxgeo Data Schema MDS) SQL Server databases are situated on the same company network in physically separate locations in a country with unreliable internet connectivity.
Only certain fields in selected tables and data belonging to two specific DataSets are to be replicated from the Source exploration database in the capital city to the Target database at the mine site.
The library tables in the Target database need to be maintained so that the main data that reference these libraries will be accepted.
The Source database also contains dirty data (violating check constraints and foreign keys but not primary keys) which would activate the data validation components of the Target database.
The Source database is actively being used and new data needs to be inserted in the Target database while modified records need to be updated and deleted records removed entirely.
Microsoft SQL Server Integration Services (SSIS) was selected as a suitable ETL platform for the solution because both Source and Target databases are MS SQL databases and the basic integration tools offered by SSIS would be adequate for the task and would not incur additional software costs for the client.
The SSIS solution consists of five packages controlled by a sixth master package to ensure that the various steps are executed in the correct order.
The SSIS master package is executed by a SQL Agent job scheduled to run multiple times in the early hours of the morning to allow for internet interruptions.
The first step is to copy the table that tracks all deleted records from the Source to the Target.
Then the affected library tables are synchronised in three successive steps. The reason for this being that the MDS makes use of three tiers of library tables that reference each other.
Then the main tables are copied across and the data synchronised for the selected columns and DataSets.
Finally, the solution would run the Assay Ranking stored procedure in the Target database to build the table tblAssayFlat and then export and email a summary report (.csv file) of the number of records inserted/updated/deleted to the database administrator.
The ETL process itself was the same for all library and main tables:
- Extract new or changed data from the source to temporary tables in the Target database
- Transform the data in the temporary tables (locking collar records to prevent editing)
- Load the prepared data from the temporary- to the main tables through appropriate SQL insert, update and delete
Due to the large volume of existing data needing to be replicated initially, an initial run was performed from a SQL backup of the Source to the Target on a local machine prior to its deployment at the mine site.
Subsequent monitoring of the solution showed that The SQL Agent job was running multiple times daily as expected and even when interrupted by connection problems, would resume where it left off to complete the process of replication. No matter how long the interruption lasts, the solution will simply continue from the point where it was interrupted.