
HBase Study Notes: HBase Performance Research (1)

Uploaded by: b****7 | Document ID: 22378729 | Upload date: 2023-02-03 | Format: DOCX | Pages: 21 | Size: 22.10 KB


    this.configuration = conf;
    this.pool = getDefaultExecutor(conf);
    this.finishSetup();

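Note the highlighted code (marked in red in the original document).

This constructor actually calls the getConnection function of HConnectionManager to obtain an HConnection object.

When working with a database through a Java API, it is common to create a connection object of this kind to hold connection-related state (anyone familiar with ODBC or JDBC will find nothing new here).

The getConnection function is implemented as follows: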

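    public static HConnection getConnection(final Configuration conf)
        throws IOException {
      HConnectionKey connectionKey = new HConnectionKey(conf);
      synchronized (CONNECTION_INSTANCES) {
        HConnectionImplementation connection = CONNECTION_INSTANCES.get(connectionKey);
        if (connection == null) {
          // No cached connection for this key: create one and cache it.
          connection = (HConnectionImplementation) createConnection(conf, true);
          CONNECTION_INSTANCES.put(connectionKey, connection);
        } else if (connection.isClosed()) {
          // The cached connection is closed: drop it and create a replacement.
          HConnectionManager.deleteConnection(connectionKey, true);
          connection = (HConnectionImplementation) createConnection(conf, true);
          CONNECTION_INSTANCES.put(connectionKey, connection);
        }
        connection.incCount();
        return connection;
      }
    }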

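Here, CONNECTION_INSTANCES has the type LinkedHashMap<HConnectionKey, HConnectionImplementation>.

Again, note the three highlighted lines of code (red in the original):

First, an HConnectionKey object is created from the conf.

Second, CONNECTION_INSTANCES is searched for the HConnectionKey that was just created.

Third, if the key is not present, createConnection is called to create an HConnection object; otherwise the HConnection object found in the map is returned directly.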
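The lookup-or-create pattern described in these three steps can be sketched in plain Java without any HBase dependency. This is only an illustration under stated assumptions: SketchConnection and SketchConnectionManager are hypothetical stand-ins for HConnectionImplementation and HConnectionManager, and a String replaces HConnectionKey.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ConnectionCacheDemo {
    public static void main(String[] args) {
        SketchConnection a = SketchConnectionManager.getConnection("cluster-1");
        SketchConnection b = SketchConnectionManager.getConnection("cluster-1");
        SketchConnection c = SketchConnectionManager.getConnection("cluster-2");
        System.out.println("same key shares instance: " + (a == b));
        System.out.println("different key shares instance: " + (a == c));
        System.out.println("users of cluster-1 connection: " + a.getRefCount());
    }
}

// Hypothetical stand-in for HConnectionImplementation: a reference count and a closed flag.
class SketchConnection {
    private int refCount = 0;
    private boolean closed = false;

    void incCount() { refCount++; }
    int getRefCount() { return refCount; }
    boolean isClosed() { return closed; }
    void close() { closed = true; }
}

// Hypothetical stand-in for the cache in HConnectionManager; a String replaces HConnectionKey.
class SketchConnectionManager {
    private static final Map<String, SketchConnection> INSTANCES = new LinkedHashMap<>();

    static SketchConnection getConnection(String key) {
        synchronized (INSTANCES) {
            SketchConnection connection = INSTANCES.get(key);
            if (connection == null || connection.isClosed()) {
                // No usable cached instance: create a fresh one and cache it under the key.
                connection = new SketchConnection();
                INSTANCES.put(key, connection);
            }
            connection.incCount(); // one more user of the shared connection
            return connection;
        }
    }
}
```

The whole lookup runs inside one synchronized block, so two threads asking for the same key cannot both create a connection.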

For completeness, let us also look at the HConnectionKey constructor and its overridden hashCode function:

    HConnectionKey(Configuration conf) {
      Map<String, String> m = new HashMap<String, String>();
      if (conf != null) {
        // Snapshot only the connection-related properties.
        for (String property : CONNECTION_PROPERTIES) {
          String value = conf.get(property);
          if (value != null) {
            m.put(property, value);
          }
        }
      }
      this.properties = Collections.unmodifiableMap(m);

      try {
        UserProvider provider = UserProvider.instantiate(conf);
        User currentUser = provider.getCurrent();
        if (currentUser != null) {
          username = currentUser.getName();
        }
      } catch (IOException ioe) {
        HConnectionManager.LOG.warn(
            "Error obtaining current user, skipping username in HConnectionKey", ioe);
      }
    }

    public int hashCode() {
      final int prime = 31;
      int result = 1;
      if (username != null) {
        result = username.hashCode();
      }
      for (String property : CONNECTION_PROPERTIES) {
        String value = properties.get(property);
        if (value != null) {
          result = prime * result + value.hashCode();
        }
      }
      return result;
    }

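As the code shows, the overridden hashCode starts from the return value of username.hashCode() and then folds in the hash of each non-null connection property value. The username comes from currentUser, currentUser comes from provider, and provider is created from conf; the property values are read from conf as well.

It follows that two conf objects with the same content (more precisely, the same user and the same values for the properties listed in CONNECTION_PROPERTIES) yield the same username and the same properties, so the overridden hashCode of HConnectionKey returns the same value for both.

CONNECTION_INSTANCES is a LinkedHashMap, and its get function calls the hashCode function of HConnectionKey (together with equals) to determine whether an entry for that key already exists.

In essence, therefore, getConnection returns a connection object based on the conf: for all conf objects with the same content, it returns one and the same connection.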
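To make the role of hashCode and equals concrete, here is a minimal sketch of a map key built from a user name and a property snapshot. SketchKey is a hypothetical, simplified class and not the real HConnectionKey; the user names and the property value are made-up example data.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

class KeyLookupDemo {
    public static void main(String[] args) {
        Map<String, String> conf1 = new HashMap<>();
        conf1.put("hbase.zookeeper.quorum", "zk1,zk2"); // example value only
        Map<String, String> conf2 = new HashMap<>(conf1); // separate object, same content

        Map<SketchKey, String> cache = new LinkedHashMap<>();
        cache.put(new SketchKey("alice", conf1), "connection-1");

        // A second key built from an equal configuration finds the cached entry.
        System.out.println(cache.get(new SketchKey("alice", conf2)));
        // A different user name produces a different key, so the lookup misses.
        System.out.println(cache.get(new SketchKey("bob", conf2)));
    }
}

// Hypothetical simplified key: a user name plus a snapshot of selected properties.
final class SketchKey {
    private final String username;
    private final Map<String, String> properties;

    SketchKey(String username, Map<String, String> properties) {
        this.username = username;
        this.properties = new HashMap<>(properties); // defensive copy
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = (username != null) ? username.hashCode() : 1;
        return prime * result + properties.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof SketchKey)) return false;
        SketchKey that = (SketchKey) other;
        return Objects.equals(username, that.username) && properties.equals(that.properties);
    }
}
```

Overriding hashCode alone would not be enough: a hash-based map first uses hashCode to pick a bucket and then equals to confirm the match, so the two methods have to agree.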

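(2) Call the createConnection method to create a connection explicitly, then use that connection to create the HTable object.

The createConnection method and the corresponding HTable constructor are as follows: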

    public static HConnection createConnection(Configuration conf) throws IOException {
      UserProvider provider = UserProvider.instantiate(conf);
      return createConnection(conf, false, null, provider.getCurrent());
    }

    static HConnection createConnection(final Configuration conf, final boolean managed,
        final ExecutorService pool, final User user)
        throws IOException {
      String className = conf.get("hbase.client.connection.impl",
          HConnectionManager.HConnectionImplementation.class.getName());
      Class<?> clazz = null;
      try {
        clazz = Class.forName(className);
      } catch (ClassNotFoundException e) {
        throw new IOException(e);
      }
      try {
        // Default HCM#HCI is not accessible; make it so before invoking.
        Constructor<?> constructor =
            clazz.getDeclaredConstructor(Configuration.class,
                boolean.class, ExecutorService.class, User.class);
        constructor.setAccessible(true);
        return (HConnection) constructor.newInstance(conf, managed, pool, user);
      } catch (Exception e) {
        throw new IOException(e);
      }
    }

    public HTable(TableName tableName, HConnection connection) throws IOException {
      this.cleanupPoolOnClose = true;
      this.cleanupConnectionOnClose = false;
      this.connection = connection;
      this.configuration = connection.getConfiguration();
      this.pool = getDefaultExecutor(this.configuration);
      this.finishSetup();
    }

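As this shows, every call to createConnection builds a new HConnection object. Creating HTable objects this way therefore means creating a new HConnection each time, unless the caller reuses one explicitly; nothing is shared automatically as in method (1).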
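The reflective instantiation used by createConnection (read a class name from the configuration, look up a non-public constructor, make it accessible, invoke it) can be sketched in plain Java. Everything named here is an assumption made for illustration: SketchConn, DefaultSketchConn, SketchFactory and the key "sketch.connection.impl" are hypothetical, and a Map stands in for Configuration.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

class ReflectiveFactoryDemo {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        SketchConn first = SketchFactory.create(conf, false);
        SketchConn second = SketchFactory.create(conf, false);
        // Each call constructs a new object; nothing is cached.
        System.out.println("distinct instances: " + (first != second));
    }
}

interface SketchConn {
    boolean isManaged();
}

class DefaultSketchConn implements SketchConn {
    private final boolean managed;

    // Private on purpose, to mirror a constructor that must be made accessible first.
    private DefaultSketchConn(Map<String, String> conf, boolean managed) {
        this.managed = managed;
    }

    @Override
    public boolean isManaged() { return managed; }
}

class SketchFactory {
    static final String IMPL_KEY = "sketch.connection.impl"; // hypothetical configuration key

    static SketchConn create(Map<String, String> conf, boolean managed) {
        // Fall back to the default implementation when no class name is configured.
        String className = conf.getOrDefault(IMPL_KEY, DefaultSketchConn.class.getName());
        try {
            Class<?> clazz = Class.forName(className);
            Constructor<?> constructor = clazz.getDeclaredConstructor(Map.class, boolean.class);
            constructor.setAccessible(true);
            return (SketchConn) constructor.newInstance(conf, managed);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + className, e);
        }
    }
}
```

Unlike the cached lookup of method (1), this factory keeps no map of instances, which is why two calls with the same configuration return two separate objects.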

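So how do these two methods perform when executing inserts, deletes, and lookups?

Let us start by analyzing the code.

For simplicity, we first look at what HTable actually does when it executes a put (insert) operation.

The put function of HTable is as follows: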

    public void put(final Put put)
        throws InterruptedIOException, RetriesExhaustedWithDetailsException {
      doPut(put);
      if (autoFlush) {
        flushCommits();
      }
    }

    private void doPut(Put put)
        throws InterruptedIOException, RetriesExhaustedWithDetailsException {
      if (ap.hasError()) {
        writeAsyncBuffer.add(put);
        backgroundFlushCommits(true);
      }

      validatePut(put);

      currentWriteBufferSize += put.heapSize();
      writeAsyncBuffer.add(put);

      // Flush in the background once the buffered data exceeds the limit.
      while (currentWriteBufferSize > writeBufferSize) {
        backgroundFlushCommits(false);
      }
    }

    private void backgroundFlushCommits(boolean synchronous)
        throws InterruptedIOException, RetriesExhaustedWithDetailsException {
      try {
        do {
          ap.submit(writeAsyncBuffer, true);
        } while (synchronous && !writeAsyncBuffer.isEmpty());

        if (synchronous) {
          ap.waitUntilDone();
        }

        if (ap.hasError()) {
          LOG.debug(tableName + ": One or more of the operations have failed -" +
              " waiting for all operation in progress to finish (successfully or not)");
          while (!writeAsyncBuffer.isEmpty()) {
            ap.submit(writeAsyncBuffer, true);
          }
          ap.waitUntilDone();

          if (!clearBufferOnFail) {
            // if clearBufferOnFailed is not set, we're supposed to keep the failed operation in the
            // write buffer. This is a questionable feature kept here for backward compatibility
            writeAsyncBuffer.addAll(ap.getFailedOperations());
          }
          RetriesExhaustedWithDetailsException e = ap.getErrors();
          ap.clearErrors();
          throw e;
        }
      } finally {
        // Recompute the buffer size from whatever is still pending.
        currentWriteBufferSize = 0;
        for (Row mut : writeAsyncBuffer) {
          if (mut instanceof Mutation) {
            currentWriteBufferSize += ((Mutation) mut).heapSize();
          }
        }
      }
    }

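As the highlighted parts (red in the original) indicate, the call sequence is put -> doPut -> backgroundFlushCommits -> ap.submit, where ap is an instance of the AsyncProcess class.

We therefore follow the trail into the AsyncProcess class, whose code is as follows: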

    public void submit(List<? extends Row> rows, boolean atLeastOne)
        throws InterruptedIOException {
      submitLowPriority(rows, atLeastOne, false);
    }

    public void submitLowPriority(List<? extends Row> rows, boolean atLeastOne,
        boolean isLowPripority) throws InterruptedIOException {
      if (rows.isEmpty()) {
        return;
      }

      // This looks like we are keying by region but HRegionLocation has a comparator that compares
      // on the server portion only (hostname + port) so this Map collects regions by server.
      Map<HRegionLocation, MultiAction<Row>> actionsByServer =
          new HashMap<HRegionLocation, MultiAction<Row>>();
      List<Action<Row>> retainedActions = new ArrayList<Action<Row>>(rows.size());

      long currentTaskCnt = tasksDone.get();
      boolean alreadyLooped = false;

      NonceGenerator ng = this.hConnection.getNonceGenerator();
      do {
        if (alreadyLooped) {
          // if, for whatever reason, we looped, we want to be sure that something has changed.
          waitForNextTaskDone(currentTaskCnt);
          currentTaskCnt = tasksDone.get();
        } else {
          alreadyLooped = true;
        }

        // Wait until there is at least one slot for a new task.
        waitForMaximumCurrentTasks(maxTotalConcurrentTasks - 1);

        // Remember the previous decisions about regions or region servers we put in the
        // final multi.
        Map<Long, Boolean> regionIncluded = new HashMap<Long, Boolean>();
        Map<ServerName, Boolean> serverIncluded = new HashMap<ServerName, Boolean>();

        int posInList = -1;
        Iterator<? extends Row> it = rows.iterator();
        while (it.hasNext()) {
          Row r = it.next();
          HRegionLocation loc = findDestLocation(r, posInList);

          if (loc == null) { // loc is null if there is an error such as meta not available.
            it.remove();
          } else if (canTakeOperation(loc, regionIncluded, serverIncluded)) {
            Action<Row> action = new Action<Row>(r, ++posInList);
            setNonce(ng, r, action);
            retainedActions.add(action);
            addAction(loc, action, actionsByServer, ng);
            it.remove();
          }
        }
      } while (retainedActions.isEmpty() && atLeastOne && !hasError());

      HConnectionManager.ServerErrorTracker errorsByServer = createServerErrorTracker();
      sendMultiAction(retainedActions, actionsByServer, 1, errorsByServer, isLowPripority);
    }

    private HRegionLocation findDestLocation(Row row, int posInList) {
      if (row == null) throw new IllegalArgumentException("#" + id + ", row cannot be null");
      HRegionLocation loc = null;
      IOException locationException = null;
      try {
        loc = hConnection.locateRegion(this.tableName, row.getRow());
        if (loc == null) {
          locationException = new IOException("#" + id + ", no location found, aborting submit for" +
              " tableName=" + tableName +
              " rowkey=" + Arrays.toString(row.getRow()));
        }
      } catch (IOException e) {
        locationException = e;
      }
      if (locationException != null) {
        // There are multiple retries in locateRegion already. No need to add new.
        // We can't continue with this row, hence it's the last retry.
        manageError(posInList, row, false, locationException, null);
        return null;
      }

      return loc;
    }

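The core mechanism in this code is asynchronous invocation. In other words, not every put operation writes data straight to HBase; instead, the client waits until the data in the write buffer has grown past a threshold (2 MB by default) and only then performs a write. This holds when autoFlush is disabled; when autoFlush is enabled, put calls flushCommits immediately after doPut, as the put function above shows.

Of course, on the server side this write presumably still has to be queued for execution; the details of that mechanism are not covered here.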
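The buffering behaviour described above can be sketched as a small size-triggered write buffer. This is a simplified model and not HBase code: SketchWriteBuffer is a hypothetical class, payload length stands in for put.heapSize(), and the flush here is synchronous, whereas the client hands batches to AsyncProcess.

```java
import java.util.ArrayList;
import java.util.List;

class WriteBufferDemo {
    public static void main(String[] args) {
        SketchWriteBuffer buffer = new SketchWriteBuffer(2L * 1024 * 1024); // 2 MB threshold
        byte[] payload = new byte[512 * 1024]; // 512 KB per simulated put
        for (int i = 0; i < 12; i++) {
            buffer.put(payload);
        }
        System.out.println("flushes triggered by size: " + buffer.getFlushCount());
        buffer.flush(); // push out whatever is still buffered
        System.out.println("flushes after final flush: " + buffer.getFlushCount());
    }
}

// Hypothetical model of a client-side write buffer with a byte-size threshold.
class SketchWriteBuffer {
    private final long maxBufferedBytes;
    private final List<byte[]> pending = new ArrayList<>();
    private long bufferedBytes = 0;
    private int flushCount = 0;

    SketchWriteBuffer(long maxBufferedBytes) {
        this.maxBufferedBytes = maxBufferedBytes;
    }

    void put(byte[] payload) {
        pending.add(payload);
        bufferedBytes += payload.length;
        // Write only once the buffered data exceeds the threshold.
        if (bufferedBytes > maxBufferedBytes) {
            flush();
        }
    }

    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // A real client would send the whole batch to the servers here.
        pending.clear();
        bufferedBytes = 0;
        flushCount++;
    }

    int getFlushCount() { return flushCount; }
    int getPendingCount() { return pending.size(); }
}
```

Batching trades latency for throughput: individual puts return quickly, but data still sitting in the buffer has not reached the server until a flush happens.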
