Installation
Below are instructions to quickly set up an environment for testing Omid on your local machine.
- Java 7
- HBase 0.98
You can find HBase distributions on this page. Then start HBase in standalone mode.
2. Clone the Omid repository and build the TSO package:
$ git clone [email protected]:yahoo/omid.git
$ cd omid
$ mvn clean install assembly:single
This generates a binary package containing all dependencies for the TSO in tso-server/target/tso-server-<VERSION>-bin.tar.gz. You can add -DskipTests to the Maven command to skip running the test suite. Unit test coverage is quite extensive and takes a while to run on each build (~40 min at the time of writing), so consider using
mvn clean install -DskipTests
to speed up temporary builds. Note that -Dmaven.test.skip=true is NOT equivalent: it also skips compiling the test classes.
As an alternative to cloning the project, you can download the required version of the TSO tar.gz package from the release repository.
You can also see the build history here.
$ tar zxvf tso-server-<VERSION>-bin.tar.gz
$ cd tso-server-<VERSION>
Ensure that the hbase.zookeeper.quorum setting in conf/hbase-site.xml points to your ZooKeeper instance, then create the Timestamp Table and the Commit Table using the omid.sh script included in the bin directory of the TSO server:
$ bin/omid.sh create-hbase-commit-table -numSplits 16
$ bin/omid.sh create-hbase-timestamp-table
These two tables are required by Omid and they must not be accessed by client applications.
$ bin/omid.sh tso
This starts the TSO server, which in turn connects to HBase to store information. By default the TSO listens on port 54758. To change the TSO configuration, edit the conf/omid.conf file.
Use your favorite IDE to create a new project.
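Before moving on to the client, it can be handy to confirm that the TSO is accepting connections. The following is a minimal sketch and not part of Omid: the class name TsoPortCheck and the 2-second timeout are assumptions made for illustration. It only checks that a TCP connection to the configured port (54758 by default) can be opened, not that the TSO is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not shipped with Omid.
public class TsoPortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults match the quickstart: TSO on localhost, port 54758.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 54758;
        System.out.println("TSO reachable at " + host + ":" + port + ": "
                + isListening(host, port, 2000));
    }
}
```

If you changed the port in conf/omid.conf, pass the host and port as arguments instead of relying on the defaults.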
Choose the right version of the hbase-client jar. For example, in a Maven-based app add the following dependency in the pom.xml file:
<dependency>
<groupId>com.yahoo.omid</groupId>
<artifactId>hbase-client</artifactId>
<version>${hbase_client.version}</version>
</dependency>
Omid has two client interfaces, TTable and TransactionManager (these interfaces will likely change slightly in the future):
- The TransactionManager is used for creating transactional contexts, that is, transactions. A builder is provided in the HBaseTransactionManager class in order to get the TransactionManager interface.
- TTable is used for putting, getting and scanning entries in an HBase table. TTable's interface is similar to the standard HTableInterface, and only requires passing the transactional context as the first parameter in the transaction-aware methods (e.g. put(Transaction tx, Put put)).
Below is a sample application that accesses data transactionally. It is a dummy application that writes two cells in two different rows of a table within a transactional context, but it is enough to show how the different Omid client APIs are used. A detailed explanation of the client interfaces can be found in the Basic Examples section.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import com.yahoo.omid.transaction.HBaseTransactionManager;
import com.yahoo.omid.transaction.TTable;
import com.yahoo.omid.transaction.Transaction;
import com.yahoo.omid.transaction.TransactionManager;

public class OmidExample {

    public static void main(String[] args) throws Exception {

        // Point the client at the TSO server
        Configuration conf = HBaseConfiguration.create();
        conf.set("tso.host", "localhost");
        conf.setInt("tso.port", 54758);

        // Obtain a TransactionManager through the builder
        TransactionManager tm = HBaseTransactionManager.newBuilder()
                                                       .withConfiguration(conf)
                                                       .build();

        // Transactional view of the HBase table
        TTable tt = new TTable(conf, "MY_TX_TABLE");

        byte[] exampleRow1 = Bytes.toBytes("EXAMPLE_ROW1");
        byte[] exampleRow2 = Bytes.toBytes("EXAMPLE_ROW2");
        byte[] family = Bytes.toBytes("MY_CF");
        byte[] qualifier = Bytes.toBytes("foo");
        byte[] dataValue1 = Bytes.toBytes("val1");
        byte[] dataValue2 = Bytes.toBytes("val2");

        // Start the transactional context
        Transaction tx = tm.begin();

        // Write one cell in each of the two rows within the transaction
        Put row1 = new Put(exampleRow1);
        row1.add(family, qualifier, dataValue1);
        tt.put(tx, row1);

        Put row2 = new Put(exampleRow2);
        row2.add(family, qualifier, dataValue2);
        tt.put(tx, row2);

        // Commit both writes together
        tm.commit(tx);

        // Release client resources
        tt.close();
        tm.close();
    }
}
To run the application, make sure core-site.xml and hbase-site.xml for your HBase cluster are present in your CLASSPATH. You will need to set tso.host and tso.port appropriately. You will also need to create an HBase table "MY_TX_TABLE" with column family "MY_CF", with TTL disabled and VERSIONS set to Integer.MAX_VALUE. For example, using the HBase shell:
create 'MY_TX_TABLE', {NAME => 'MY_CF', VERSIONS => '2147483647', TTL => '2147483647'}
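Both values in the command above are Integer.MAX_VALUE written out as a decimal literal. For TTL, that value is the one HBase treats as "forever" (HConstants.FOREVER), which is how TTL is disabled. As a small sketch, the statement can be generated from Java so that the constant is not typed by hand. The class and method names below are hypothetical helpers, not part of Omid or HBase.

```java
// Hypothetical helper that prints the HBase shell statement for a
// transactional table; it does not talk to HBase itself.
public class CreateTableStatement {

    // Builds a "create" statement with VERSIONS and TTL set to Integer.MAX_VALUE.
    static String create(String table, String family) {
        String max = Integer.toString(Integer.MAX_VALUE);
        return "create '" + table + "', {NAME => '" + family
                + "', VERSIONS => '" + max + "', TTL => '" + max + "'}";
    }

    public static void main(String[] args) {
        // Table and column family names are the ones used in the sample application.
        System.out.println(create("MY_TX_TABLE", "MY_CF"));
    }
}
```

The printed line can be pasted into the HBase shell as is.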
This example assumes non-secure communication with HBase. If your HBase cluster is secured with Kerberos, you will need to use the UserGroupInformation API to log in securely.
Omid includes a jar with an HBase coprocessor for performing data cleanup that operates during compactions, both minor and major. Specifically, it does the following:
- Cleans up garbage data from aborted transactions
- Purges deleted cells. Omid deletes work by placing a special tombstone marker in cells. The compactor detects these and actually purges data when it is safe to do so (i.e. when there are no committable transactions that may read the data).
- 'Heals' committed cells for which the writer failed to write shadow cells.
To deploy the coprocessor, the coprocessor jar must be placed in a location (typically on HDFS) that is accessible by HBase region servers. The coprocessor may then be enabled on a transactional table by the following steps in the HBase shell:
1) Disable the table
disable 'MY_TX_TABLE'
2) Add a coprocessor specification to the table via a "coprocessor" attribute. The coprocessor spec may (and usually will) also include the name of the Omid commit table
alter 'MY_TX_TABLE', METHOD => 'table_att', 'coprocessor'=>'<path_to_omid_coprocessor>/omid-hbase-coprocessor-<coprocessor_version>.jar|com.yahoo.omid.transaction.OmidCompactor|1001|omid.committable.tablename=OMID_COMMIT_TABLE'
3) Add an "OMID_ENABLED => true" flag to any column families that the coprocessor should work on
alter 'MY_TX_TABLE', { NAME => 'MY_CF', METADATA => {'OMID_ENABLED' => 'true'}}
4) Re-enable the table
enable 'MY_TX_TABLE'
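The value of the "coprocessor" attribute in step 2 is a single string with four fields separated by |: the jar location, the coprocessor class, a priority, and optional key=value arguments. The sketch below assembles that string and the corresponding alter statement. CoprocessorSpec and its methods are hypothetical helpers for illustration only, and the jar path is the same placeholder used above, which must be replaced with the real location.

```java
// Hypothetical helper that assembles the coprocessor attribute shown in step 2.
public class CoprocessorSpec {

    // Joins the four fields of a table coprocessor spec with '|'.
    static String build(String jarPath, String className, int priority, String args) {
        return String.join("|", jarPath, className, Integer.toString(priority), args);
    }

    // Wraps the spec in the HBase shell "alter" statement used in step 2.
    static String alterStatement(String table, String spec) {
        return "alter '" + table + "', METHOD => 'table_att', 'coprocessor'=>'" + spec + "'";
    }

    public static void main(String[] args) {
        String spec = build(
                // Placeholder path: substitute the real jar location and version.
                "<path_to_omid_coprocessor>/omid-hbase-coprocessor-<coprocessor_version>.jar",
                "com.yahoo.omid.transaction.OmidCompactor",
                1001,
                "omid.committable.tablename=OMID_COMMIT_TABLE");
        System.out.println(alterStatement("MY_TX_TABLE", spec));
    }
}
```

Generating the statement this way keeps the field order fixed, which helps when the commit table name differs between environments.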
Omid
Copyright 2011-2015 Yahoo Inc. Licensed under the Apache License, Version 2.0