A basic outline for installation is:
1. Install the master replicator to extract information from your transactional store.
2. Install the slave replicator to apply data into HDFS within your Hadoop cluster.
3. Download the Hadoop tools from https://github.com/continuent/continuent-tools-hadoop/
The tools provide five separate functions:
a. Generate staging table DDL within Hive
b. Generate live table DDL within Hive
c. Generate a suitable Sqoop statement to provision any existing data
d. Materialise the tables from the change data into the carbon copy tables
e. Compare the data in the live transactional tables with the data in the Hive tables
Full details of the process are in the documentation: https://docs.continuent.com/tungsten-replicator-3.0/deployment-hadoop.html
The current status of replication can be checked at any time by running the 'trepctl status' command on each replicator. If both replicators are running and the sequence numbers on the master and slave match, replication is operating normally.
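The check described above can be scripted. The following is a minimal sketch, not an official tool: it assumes that 'trepctl status' prints "name : value" lines, that the fields of interest are 'state' and 'appliedLastSeqno', and that "running" means a state of ONLINE. The helper names and the use of the '-host' option to query each replicator are illustrative assumptions; adjust them to match your deployment.

```python
import subprocess


def parse_trepctl_status(text):
    """Parse 'name : value' lines from trepctl status output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as thl:// URIs stay intact.
        name, sep, value = line.partition(":")
        if sep and name.strip():
            fields[name.strip()] = value.strip()
    return fields


def replication_in_sync(master, slave):
    """Return True if both replicators are ONLINE and report the same seqno.

    Assumption: 'state' and 'appliedLastSeqno' are the relevant status fields.
    """
    both_online = master.get("state") == "ONLINE" and slave.get("state") == "ONLINE"
    master_seqno = master.get("appliedLastSeqno")
    return (
        both_online
        and master_seqno is not None
        and master_seqno == slave.get("appliedLastSeqno")
    )


def fetch_status(host):
    """Run trepctl against a host (hypothetical helper; assumes trepctl is on PATH)."""
    result = subprocess.run(
        ["trepctl", "-host", host, "status"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_trepctl_status(result.stdout)
```

With real hosts, this would be called as replication_in_sync(fetch_status("master-host"), fetch_status("slave-host")), where both host names are placeholders. A brief mismatch in sequence numbers is expected while the slave is catching up, so a single False result is not by itself evidence of a fault.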
To verify that the data is consistent, use the 'dc' tool that is part of the GitHub Hadoop tools repository; it compares the data in the source and Hadoop stores.