Open-source software name: Tephra
Open-source software address: https://gitee.com/apache/tephra
Open-source software introduction:

Note: Tephra has moved to Apache Incubator

For latest updates on Apache Tephra go to its new site at http://tephra.incubator.apache.org.

Transactions for Apache HBase™: Cask Tephra provides globally consistent transactions on top of Apache HBase. While HBase provides strong consistency with row- or region-level ACID operations, it sacrifices cross-region and cross-table consistency in favor of scalability. This trade-off requires application developers to handle the complexity of ensuring consistency when their modifications span region boundaries. By providing support for global transactions that span regions, tables, or multiple RPCs, Tephra simplifies application development on top of HBase, without a significant impact on performance or scalability for many workloads.

How It Works

Tephra leverages HBase's native data versioning to provide multi-versioned concurrency control (MVCC) for transactional reads and writes. With MVCC capability, each transaction sees its own consistent "snapshot" of data, providing snapshot isolation of concurrent transactions.

Tephra consists of three main components:
Transaction Server: A central transaction manager generates a globally unique, time-based transaction ID for each transaction that is started, and maintains the state of all in-progress and recently committed transactions for conflict detection. While multiple transaction server instances can be run concurrently for automatic failover, only one server instance is actively serving requests at a time. This is coordinated by performing leader election amongst the running instances through ZooKeeper. The active transaction server instance will also register itself using a service discovery interface in ZooKeeper, allowing clients to discover the currently active server instance without additional configuration.

Transaction Client: A client makes a call to the active transaction server in order to start a new transaction. This returns a new transaction instance to the client, with a unique transaction ID (used to identify writes for the transaction), as well as a list of transaction IDs to exclude for reads (from in-progress or invalidated transactions). When performing writes, the client overrides the timestamp for all modified HBase cells with the transaction ID. When reading data from HBase, the client skips cells associated with any of the excluded transaction IDs. The read exclusions are applied through a server-side filter injected by the TransactionProcessor coprocessor.

TransactionProcessor Coprocessor: Injects the server-side filter that applies the read exclusions.

More details on how Tephra transactions work and the interactions between these components can be found in our Transactions over HBase presentation.

Is It Building?

Status of continuous integration build at Travis CI:

Requirements

Java Runtime

The latest JDK or JRE version 1.7.xx or 1.8.xx for Linux, Windows, or Mac OS X must be installed in your environment; we recommend the Oracle JDK.

To check the Java version installed, run the command:

    $ java -version

Tephra is tested with the Oracle JDKs; it may work with other JDKs such as OpenJDK, but it has not been tested with them.
Once you have installed the JDK, you'll need to set the JAVA_HOME environment variable.

Hadoop/HBase Environment

Tephra requires a working HBase and HDFS environment in order to operate. Tephra supports these component versions:
Note: The component versions shown in this table are those that we have tested and whose suitability and compatibility we are confident of. Later versions of components may work, but have not necessarily been either tested or confirmed compatible.

Getting Started

You can get started with Tephra by building directly from the latest source code:

    git clone https://github.com/caskdata/tephra.git
    cd tephra
    mvn clean package

After the build completes, you will have a full binary distribution of Tephra under the

For any client applications, add the following dependencies to any Apache Maven POM files (or your build system's equivalent configuration), in order to make use of Tephra classes:

    <dependency>
      <groupId>co.cask.tephra</groupId>
      <artifactId>tephra-api</artifactId>
      <version>0.7.1</version>
    </dependency>
    <dependency>
      <groupId>co.cask.tephra</groupId>
      <artifactId>tephra-core</artifactId>
      <version>0.7.1</version>
    </dependency>

Since the HBase APIs have changed between versions, you will need to select the appropriate HBase compatibility library.
For HBase 0.96.x:

    <dependency>
      <groupId>co.cask.tephra</groupId>
      <artifactId>tephra-hbase-compat-0.96</artifactId>
      <version>0.7.1</version>
    </dependency>

For HBase 0.98.x:

    <dependency>
      <groupId>co.cask.tephra</groupId>
      <artifactId>tephra-hbase-compat-0.98</artifactId>
      <version>0.7.1</version>
    </dependency>

For HBase 1.0.x:

    <dependency>
      <groupId>co.cask.tephra</groupId>
      <artifactId>tephra-hbase-compat-1.0</artifactId>
      <version>0.7.1</version>
    </dependency>

If you are running the CDH 5.4, 5.5, or 5.6 version of HBase 1.0.x (this version contains API incompatibilities with Apache HBase 1.0.x):

    <dependency>
      <groupId>co.cask.tephra</groupId>
      <artifactId>tephra-hbase-compat-1.0-cdh</artifactId>
      <version>0.7.1</version>
    </dependency>

For HBase 1.1.x or the CDH 5.7 version of HBase 1.2.x:

    <dependency>
      <groupId>co.cask.tephra</groupId>
      <artifactId>tephra-hbase-compat-1.1</artifactId>
      <version>0.7.1</version>
    </dependency>

Deployment and Configuration

Tephra makes use of a central transaction server to assign unique transaction IDs for data modifications and to perform conflict detection. Only a single transaction server can actively handle client requests at a time; however, additional transaction server instances can be run simultaneously, providing automatic failover if the active server becomes unreachable.

Transaction Server Configuration

The Tephra transaction server can be deployed on the same cluster nodes running the HBase HMaster process. The transaction server requires that the HBase libraries be available on the server's Java

The transaction server supports the following configuration properties. All configuration properties can be added to the
To run the transaction server, execute the following command in your Tephra installation:

    ./bin/tephra start

Any environment-specific customizations can be made by editing the

Client Configuration

Since Tephra clients will be communicating with HBase, the HBase client libraries and the HBase cluster configuration must be available on the client's Java

Client API usage is described in the Client APIs section.

The transaction service client supports the following configuration properties. All configuration properties can be added to the
HBase Coprocessor Configuration

In addition to the transaction server, Tephra requires an HBase coprocessor to be installed on all tables where transactional reads and writes will be performed.

To configure the coprocessor on all HBase tables, add the following to

For HBase 0.96.x:

    <property>
      <name>hbase.coprocessor.region.classes</name>
      <value>co.cask.tephra.hbase96.coprocessor.TransactionProcessor</value>
    </property>

For HBase 0.98.x:

    <property>
      <name>hbase.coprocessor.region.classes</name>
      <value>co.cask.tephra.hbase98.coprocessor.TransactionProcessor</value>
    </property>

For HBase 1.0.x:

    <property>
      <name>hbase.coprocessor.region.classes</name>
      <value>co.cask.tephra.hbase10.coprocessor.TransactionProcessor</value>
    </property>

For the CDH 5.4, 5.5, or 5.6 version of HBase 1.0.x:

    <property>
      <name>hbase.coprocessor.region.classes</name>
      <value>co.cask.tephra.hbase10cdh.coprocessor.TransactionProcessor</value>
    </property>

For HBase 1.1.x or the CDH 5.7 version of HBase 1.2.x:

    <property>
      <name>hbase.coprocessor.region.classes</name>
      <value>co.cask.tephra.hbase11.coprocessor.TransactionProcessor</value>
    </property>

You may configure the

Using Existing HBase Tables Transactionally

Tephra overrides HBase cell timestamps with transaction IDs, and uses these transaction IDs to filter out cells older than the TTL (Time-To-Live). Transaction IDs are at a higher scale than cell timestamps. When a regular HBase table that has existing data is converted to a transactional table, existing data may be filtered out during reads. To allow reading of existing data from a transactional table, you will need to set the property

Note that even without the property

Metrics Reporting

Tephra ships with built-in support for reporting metrics via JMX and a log file, using the Dropwizard Metrics library.

To enable JMX reporting for metrics, you will need to enable JMX in the Java runtime arguments.
Edit the

    # export JMX_OPTS="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=13001"
    # export OPTS="$OPTS $JMX_OPTS"

To enable file-based reporting for metrics, edit the

    <appender name="METRICS" class="ch.qos.logback.core.rolling.RollingFileAppender">
      <file>/FILE-PATH/metrics.log</file>
      <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
        <fileNamePattern>metrics.log.%d{yyyy-MM-dd}</fileNamePattern>
        <maxHistory>30</maxHistory>
      </rollingPolicy>
      <encoder>
        <pattern>%d{ISO8601} %msg%n</pattern>
      </encoder>
    </appender>
    <logger name="tephra-metrics" level="TRACE" additivity="false">
      <appender-ref ref="METRICS" />
    </logger>

The frequency of metrics reporting may be configured by setting the

Client APIs

The
Other operations are not supported transactionally and will throw an
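Note that for

Usage

To use a

    TransactionContext context = new TransactionContext(client, transactionAwareHTable);
    try {
      // Obtain a transaction from the active transaction server.
      context.start();
      transactionAwareHTable.put(new Put(Bytes.toBytes("row")));
      // ...
      // Commit; a failure here is reported as TransactionFailureException.
      context.finish();
    } catch (TransactionFailureException e) {
      // Roll back the writes made in this transaction.
      context.abort();
    }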
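The start, finish, and abort calls in the usage snippet above can be pictured with a small in-memory model. The following is a minimal sketch under stated assumptions, not Tephra's implementation: `ToyTransactionManager`, `Tx`, and all of their methods are hypothetical names, a simple counter stands in for time-based transaction IDs, everything runs in one thread, and invalidated transactions are not modeled. It illustrates three ideas from the How It Works section: writes are stamped with the transaction ID, reads skip versions from excluded transactions, and a commit is refused when an overlapping transaction already committed a change to the same row.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy in-memory model of a transaction manager. Hypothetical; not part of Tephra. */
final class ToyTransactionManager {

  /** What a client holds between start() and finish() or abort(). */
  static final class Tx {
    final long id;                                      // also stamps every write
    final Set<Long> excluded;                           // IDs in progress when this one started
    final Set<String> changes = new HashSet<String>();  // rows written so far

    Tx(long id, Set<Long> excluded) {
      this.id = id;
      this.excluded = excluded;
    }
  }

  private long nextId = 1;  // assumption: a counter instead of time-based IDs
  private final Set<Long> inProgress = new HashSet<Long>();
  // commit ID -> rows changed by that commit, kept for conflict detection
  private final TreeMap<Long, Set<String>> committed = new TreeMap<Long, Set<String>>();
  // row -> (writer transaction ID -> value); each write is a separate version
  private final Map<String, TreeMap<Long, String>> store =
      new HashMap<String, TreeMap<Long, String>>();

  Tx start() {
    Tx tx = new Tx(nextId++, new HashSet<Long>(inProgress));
    inProgress.add(tx.id);
    return tx;
  }

  void put(Tx tx, String row, String value) {
    TreeMap<Long, String> versions = store.get(row);
    if (versions == null) {
      versions = new TreeMap<Long, String>();
      store.put(row, versions);
    }
    versions.put(tx.id, value);
    tx.changes.add(row);
  }

  /** Returns the newest version that is not newer than tx and not from an excluded writer. */
  String get(Tx tx, String row) {
    TreeMap<Long, String> versions = store.get(row);
    if (versions == null) {
      return null;
    }
    for (Map.Entry<Long, String> e : versions.headMap(tx.id, true).descendingMap().entrySet()) {
      if (!tx.excluded.contains(e.getKey())) {
        return e.getValue();
      }
    }
    return null;
  }

  /** Returns false if a transaction that committed after tx started changed the same rows. */
  boolean finish(Tx tx) {
    for (Set<String> rows : committed.tailMap(tx.id, false).values()) {
      if (!Collections.disjoint(rows, tx.changes)) {
        return false;  // conflict: the caller is expected to call abort(tx)
      }
    }
    committed.put(nextId++, tx.changes);
    inProgress.remove(tx.id);
    return true;
  }

  /** Removes every version written by tx. */
  void abort(Tx tx) {
    for (String row : tx.changes) {
      store.get(row).remove(tx.id);
    }
    inProgress.remove(tx.id);
  }

  public static void main(String[] args) {
    ToyTransactionManager tm = new ToyTransactionManager();
    Tx t1 = tm.start();
    Tx t2 = tm.start();                 // t2 excludes t1, which is still in progress
    tm.put(t1, "row", "from-t1");
    tm.put(t2, "row", "from-t2");
    System.out.println(tm.finish(t1));  // true: nothing committed since t1 started
    if (!tm.finish(t2)) {               // conflict with t1's committed change
      tm.abort(t2);
    }
    System.out.println(tm.get(tm.start(), "row"));  // from-t1
  }
}
```

In this sketch `finish` signals a conflict with a boolean return value to keep the example short, whereas the usage snippet above reports failure through `TransactionFailureException`; in both cases the caller responds by calling `abort`.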
Example

To demonstrate how you might use

    // Imports (java.io, java.util, Guava Throwables, HBase client, Tephra) are omitted.

    /**
     * A Transactional SecondaryIndexTable.
     */
    public class SecondaryIndexTable {
      private byte[] secondaryIndex;
      private TransactionAwareHTable transactionAwareHTable;
      private TransactionAwareHTable secondaryIndexTable;
      private TransactionContext transactionContext;
      private final TableName secondaryIndexTableName;
      private static final byte[] secondaryIndexFamily = Bytes.toBytes("secondaryIndexFamily");
      private static final byte[] secondaryIndexQualifier = Bytes.toBytes('r');
      private static final byte[] DELIMITER = new byte[] {0};

      public SecondaryIndexTable(TransactionServiceClient transactionServiceClient, HTable hTable,
                                 byte[] secondaryIndex) {
        secondaryIndexTableName = TableName.valueOf(hTable.getName().getNameAsString() + ".idx");
        HTable secondaryIndexHTable = null;
        HBaseAdmin hBaseAdmin = null;
        try {
          hBaseAdmin = new HBaseAdmin(hTable.getConfiguration());
          if (!hBaseAdmin.tableExists(secondaryIndexTableName)) {
            hBaseAdmin.createTable(new HTableDescriptor(secondaryIndexTableName));
          }
          secondaryIndexHTable = new HTable(hTable.getConfiguration(), secondaryIndexTableName);
        } catch (Exception e) {
          Throwables.propagate(e);
        } finally {
          try {
            // Guard against a failed HBaseAdmin construction.
            if (hBaseAdmin != null) {
              hBaseAdmin.close();
            }
          } catch (Exception e) {
            Throwables.propagate(e);
          }
        }

        this.secondaryIndex = secondaryIndex;
        this.transactionAwareHTable = new TransactionAwareHTable(hTable);
        this.secondaryIndexTable = new TransactionAwareHTable(secondaryIndexHTable);
        // Both tables take part in the same transaction.
        this.transactionContext = new TransactionContext(transactionServiceClient, transactionAwareHTable,
                                                         secondaryIndexTable);
      }

      public Result get(Get get) throws IOException {
        return get(Collections.singletonList(get))[0];
      }

      public Result[] get(List<Get> gets) throws IOException {
        try {
          transactionContext.start();
          Result[] result = transactionAwareHTable.get(gets);
          transactionContext.finish();
          return result;
        } catch (Exception e) {
          try {
            transactionContext.abort();
          } catch (TransactionFailureException e1) {
            throw new IOException("Could not rollback transaction", e1);
          }
        }
        // Reached only after a successful rollback.
        return null;
      }

      public Result[] getByIndex(byte[] value) throws IOException {
        try {
          transactionContext.start();
          Scan scan = new Scan(value, Bytes.add(value, new byte[0]));
          scan.addColumn(secondaryIndexFamily, secondaryIndexQualifier);
          ResultScanner indexScanner = secondaryIndexTable.getScanner(scan);

          ArrayList<Get> gets = new ArrayList<Get>();
          for (Result result : indexScanner) {
            for (Cell cell : result.listCells()) {
              gets.add(new Get(cell.getValue()));
            }
          }
          Result[] results = transactionAwareHTable.get(gets);
          transactionContext.finish();
          return results;
        } catch (Exception e) {
          try {
            transactionContext.abort();
          } catch (TransactionFailureException e1) {
            throw new IOException("Could not rollback transaction", e1);
          }
        }
        // Reached only after a successful rollback.
        return null;
      }

      public void put(Put put) throws IOException {
        put(Collections.singletonList(put));
      }

      public void put(List<Put> puts) throws IOException {
        try {
          transactionContext.start();
          ArrayList<Put> secondaryIndexPuts = new ArrayList<Put>();
          for (Put put : puts) {
            List<Put> indexPuts = new ArrayList<Put>();
            Set<Map.Entry<byte[], List<KeyValue>>> familyMap = put.getFamilyMap().entrySet();
            for (Map.Entry<byte[], List<KeyValue>> family : familyMap) {
              for (KeyValue value : family.getValue()) {
                // Compare array contents; byte[].equals() only compares references.
                if (Bytes.equals(value.getQualifier(), secondaryIndex)) {
                  byte[] secondaryRow = Bytes.add(value.getQualifier(), DELIMITER,
                                                  Bytes.add(value.getValue(), DELIMITER,
                                                            value.getRow()));
                  Put indexPut = new Put(secondaryRow);
                  indexPut.add(secondaryIndexFamily, secondaryIndexQualifier, put.getRow());
                  indexPuts.add(indexPut);
                }
              }
            }
            secondaryIndexPuts.addAll(indexPuts);
          }
          // Data and index writes commit or roll back together.
          transactionAwareHTable.put(puts);
          secondaryIndexTable.put(secondaryIndexPuts);
          transactionContext.finish();
        } catch (Exception e) {
          try {
            transactionContext.abort();
          } catch (TransactionFailureException e1) {
            throw new IOException("Could not rollback transaction", e1);
          }
        }
      }
    }

Known Issues and Limitations
How to Contribute

Interested in helping to improve Tephra? We welcome all contributions, whether in filing detailed bug reports, submitting pull requests for code changes and improvements, or by asking questions and assisting others on the mailing list.

Bug Reports & Feature Requests

Bugs and tasks are tracked in a public JIRA issue tracker.

Tephra User Groups and Mailing Lists
IRC

Have questions about how Tephra works, or need help using it? Drop by the

Pull Requests

We have a simple pull-based development model with a consensus-building phase, similar to Apache's voting process. If you'd like to help make Tephra better by adding new features, enhancing existing features, or fixing bugs, here's how to do it:
Thanks for helping to improve Tephra!

License and Trademarks

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this product except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Cask, Cask Tephra and Tephra are trademarks of Cask Data, Inc. All rights reserved.

Apache, Apache HBase, and HBase are trademarks of The Apache Software Foundation. Used with permission. No endorsement by The Apache Software Foundation is implied by the use of these marks.