bit1129

[HBase 10] Anatomy of the HBase Storage File, HFile

 

1. First, look at what a file stored by HBase contains

Run the following commands in the HBase shell to add test data:

create 'table3', 'colfam1', { SPLITS => ['row-300', 'row-500', 'row-700' , 'row-900'] }

 

for i in '0'..'9' do for j in '0'..'9' do for k in '0'..'9' do put 'table3', "row-#{i}#{j}#{k}", "colfam1:#{j}#{k}", "#{j}#{k}" end end end
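To see why the dump further down contains exactly the rows row-500 through row-699, it can help to model how the four split points partition the key space. The sketch below is plain Python, not HBase code. `region_index` is a hypothetical helper, and it assumes the usual rule that a region covers [start key, end key) under byte-wise ordering of row keys.

```python
from bisect import bisect_right

# Split points from the `create` command above; four splits give five regions.
SPLITS = [b"row-300", b"row-500", b"row-700", b"row-900"]


def region_index(row: bytes) -> int:
    """Return the 0-based region a row falls into (hypothetical helper).

    A row equal to a split point belongs to the region that starts there,
    which is why bisect_right is used rather than bisect_left.
    """
    return bisect_right(SPLITS, row)


# The same 1000 row keys the shell loop generates: row-000 .. row-999.
rows = [f"row-{n:03d}".encode() for n in range(1000)]

counts = {}
for row in rows:
    idx = region_index(row)
    counts[idx] = counts.get(idx, 0) + 1

print(counts)
```

Under these assumptions the third region (index 2) receives 200 rows, which is consistent with the `entries=200` reported for the HFile inspected below.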

 

Flush the data from the MemStore to disk:

flush 'table3'

 

Run the insert loop once more:

for i in '0'..'9' do for j in '0'..'9' do for k in '0'..'9' do put 'table3', "row-#{i}#{j}#{k}", "colfam1:#{j}#{k}", "#{j}#{k}" end end end

 

 

Then run the following command from an OS shell in HBase's bin directory (not inside the HBase shell):

[hadoop@hadoop bin]$ ./hbase org.apache.hadoop.hbase.io.hfile.HFile -f /hbase/data/default/table3/1fa2e49c7404d3cd39afc39a99cc1c26/colfam1/0f6fc234c3014b6e9d84d3cae065d1b4 -v -m -p

where:

1fa2e49c7404d3cd39afc39a99cc1c26 is the (encoded) region name, and 0f6fc234c3014b6e9d84d3cae065d1b4 is the name of one HFile.

Output:

Scanning -> /hbase/data/default/table3/1fa2e49c7404d3cd39afc39a99cc1c26/colfam1/0f6fc234c3014b6e9d84d3cae065d1b4
2015-04-09 22:53:01,918 INFO  [main] hfile.CacheConfig: CacheConfig:disabled

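///Note: K: and V: are the key and value of each KV pair in the HFile. As the output below shows, every K takes up a fair number of bytes, because it is built from the rowKey, the column (family:columnName), ...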
///The actual data stored as serialized KeyValue instances
K: row-500/colfam1:00/1428632364152/Put/vlen=2/seqid=5 V: 00
K: row-501/colfam1:01/1428632364177/Put/vlen=2/seqid=7 V: 01
K: row-502/colfam1:02/1428632364204/Put/vlen=2/seqid=9 V: 02
K: row-503/colfam1:03/1428632364287/Put/vlen=2/seqid=11 V: 03
K: row-504/colfam1:04/1428632364309/Put/vlen=2/seqid=13 V: 04
K: row-505/colfam1:05/1428632364318/Put/vlen=2/seqid=15 V: 05
K: row-506/colfam1:06/1428632364330/Put/vlen=2/seqid=17 V: 06
K: row-507/colfam1:07/1428632364351/Put/vlen=2/seqid=19 V: 07
K: row-508/colfam1:08/1428632364361/Put/vlen=2/seqid=21 V: 08
K: row-509/colfam1:09/1428632364381/Put/vlen=2/seqid=23 V: 09
K: row-510/colfam1:10/1428632364400/Put/vlen=2/seqid=25 V: 10
K: row-511/colfam1:11/1428632364411/Put/vlen=2/seqid=27 V: 11
K: row-512/colfam1:12/1428632364426/Put/vlen=2/seqid=29 V: 12
K: row-513/colfam1:13/1428632364440/Put/vlen=2/seqid=31 V: 13
K: row-514/colfam1:14/1428632364474/Put/vlen=2/seqid=33 V: 14
K: row-515/colfam1:15/1428632364496/Put/vlen=2/seqid=35 V: 15
K: row-516/colfam1:16/1428632364521/Put/vlen=2/seqid=37 V: 16
K: row-517/colfam1:17/1428632364528/Put/vlen=2/seqid=39 V: 17
K: row-518/colfam1:18/1428632364539/Put/vlen=2/seqid=41 V: 18
K: row-519/colfam1:19/1428632364551/Put/vlen=2/seqid=43 V: 19
K: row-520/colfam1:20/1428632364561/Put/vlen=2/seqid=45 V: 20
K: row-521/colfam1:21/1428632364574/Put/vlen=2/seqid=47 V: 21
K: row-522/colfam1:22/1428632364589/Put/vlen=2/seqid=49 V: 22
K: row-523/colfam1:23/1428632364602/Put/vlen=2/seqid=51 V: 23
K: row-524/colfam1:24/1428632364617/Put/vlen=2/seqid=53 V: 24
K: row-525/colfam1:25/1428632364634/Put/vlen=2/seqid=55 V: 25
K: row-526/colfam1:26/1428632364647/Put/vlen=2/seqid=57 V: 26
K: row-527/colfam1:27/1428632364653/Put/vlen=2/seqid=59 V: 27
K: row-528/colfam1:28/1428632364665/Put/vlen=2/seqid=61 V: 28
K: row-529/colfam1:29/1428632364734/Put/vlen=2/seqid=63 V: 29
K: row-530/colfam1:30/1428632364746/Put/vlen=2/seqid=65 V: 30
K: row-531/colfam1:31/1428632364760/Put/vlen=2/seqid=67 V: 31
K: row-532/colfam1:32/1428632364777/Put/vlen=2/seqid=69 V: 32
K: row-533/colfam1:33/1428632364819/Put/vlen=2/seqid=71 V: 33
K: row-534/colfam1:34/1428632364831/Put/vlen=2/seqid=73 V: 34
K: row-535/colfam1:35/1428632364837/Put/vlen=2/seqid=75 V: 35
K: row-536/colfam1:36/1428632364846/Put/vlen=2/seqid=77 V: 36
K: row-537/colfam1:37/1428632364852/Put/vlen=2/seqid=79 V: 37
K: row-538/colfam1:38/1428632364861/Put/vlen=2/seqid=81 V: 38
K: row-539/colfam1:39/1428632364872/Put/vlen=2/seqid=83 V: 39
K: row-540/colfam1:40/1428632364880/Put/vlen=2/seqid=85 V: 40
K: row-541/colfam1:41/1428632364886/Put/vlen=2/seqid=87 V: 41
K: row-542/colfam1:42/1428632364897/Put/vlen=2/seqid=89 V: 42
K: row-543/colfam1:43/1428632364909/Put/vlen=2/seqid=91 V: 43
K: row-544/colfam1:44/1428632364924/Put/vlen=2/seqid=93 V: 44
K: row-545/colfam1:45/1428632364937/Put/vlen=2/seqid=95 V: 45
K: row-546/colfam1:46/1428632364946/Put/vlen=2/seqid=97 V: 46
K: row-547/colfam1:47/1428632364955/Put/vlen=2/seqid=99 V: 47
K: row-548/colfam1:48/1428632364964/Put/vlen=2/seqid=101 V: 48
K: row-549/colfam1:49/1428632364976/Put/vlen=2/seqid=103 V: 49
K: row-550/colfam1:50/1428632364982/Put/vlen=2/seqid=105 V: 50
K: row-551/colfam1:51/1428632364992/Put/vlen=2/seqid=107 V: 51
K: row-552/colfam1:52/1428632365001/Put/vlen=2/seqid=109 V: 52
K: row-553/colfam1:53/1428632365011/Put/vlen=2/seqid=111 V: 53
K: row-554/colfam1:54/1428632365020/Put/vlen=2/seqid=113 V: 54
K: row-555/colfam1:55/1428632365035/Put/vlen=2/seqid=115 V: 55
K: row-556/colfam1:56/1428632365048/Put/vlen=2/seqid=117 V: 56
K: row-557/colfam1:57/1428632365056/Put/vlen=2/seqid=119 V: 57
K: row-558/colfam1:58/1428632365064/Put/vlen=2/seqid=121 V: 58
K: row-559/colfam1:59/1428632365080/Put/vlen=2/seqid=123 V: 59
K: row-560/colfam1:60/1428632365095/Put/vlen=2/seqid=125 V: 60
K: row-561/colfam1:61/1428632365111/Put/vlen=2/seqid=127 V: 61
K: row-562/colfam1:62/1428632365123/Put/vlen=2/seqid=129 V: 62
K: row-563/colfam1:63/1428632365133/Put/vlen=2/seqid=131 V: 63
K: row-564/colfam1:64/1428632365142/Put/vlen=2/seqid=133 V: 64
K: row-565/colfam1:65/1428632365151/Put/vlen=2/seqid=135 V: 65
K: row-566/colfam1:66/1428632365159/Put/vlen=2/seqid=137 V: 66
K: row-567/colfam1:67/1428632365169/Put/vlen=2/seqid=139 V: 67
K: row-568/colfam1:68/1428632365179/Put/vlen=2/seqid=141 V: 68
K: row-569/colfam1:69/1428632365192/Put/vlen=2/seqid=143 V: 69
K: row-570/colfam1:70/1428632365200/Put/vlen=2/seqid=145 V: 70
K: row-571/colfam1:71/1428632365209/Put/vlen=2/seqid=147 V: 71
K: row-572/colfam1:72/1428632365217/Put/vlen=2/seqid=149 V: 72
K: row-573/colfam1:73/1428632365226/Put/vlen=2/seqid=151 V: 73
K: row-574/colfam1:74/1428632365237/Put/vlen=2/seqid=153 V: 74
K: row-575/colfam1:75/1428632365245/Put/vlen=2/seqid=155 V: 75
K: row-576/colfam1:76/1428632365253/Put/vlen=2/seqid=157 V: 76
K: row-577/colfam1:77/1428632365265/Put/vlen=2/seqid=159 V: 77
K: row-578/colfam1:78/1428632365279/Put/vlen=2/seqid=161 V: 78
K: row-579/colfam1:79/1428632365287/Put/vlen=2/seqid=163 V: 79
K: row-580/colfam1:80/1428632365294/Put/vlen=2/seqid=165 V: 80
K: row-581/colfam1:81/1428632365305/Put/vlen=2/seqid=167 V: 81
K: row-582/colfam1:82/1428632365314/Put/vlen=2/seqid=169 V: 82
K: row-583/colfam1:83/1428632365321/Put/vlen=2/seqid=171 V: 83
K: row-584/colfam1:84/1428632365343/Put/vlen=2/seqid=173 V: 84
K: row-585/colfam1:85/1428632365352/Put/vlen=2/seqid=175 V: 85
K: row-586/colfam1:86/1428632365375/Put/vlen=2/seqid=177 V: 86
K: row-587/colfam1:87/1428632365535/Put/vlen=2/seqid=179 V: 87
K: row-588/colfam1:88/1428632365560/Put/vlen=2/seqid=181 V: 88
K: row-589/colfam1:89/1428632365569/Put/vlen=2/seqid=183 V: 89
K: row-590/colfam1:90/1428632365582/Put/vlen=2/seqid=185 V: 90
K: row-591/colfam1:91/1428632365594/Put/vlen=2/seqid=187 V: 91
K: row-592/colfam1:92/1428632365620/Put/vlen=2/seqid=189 V: 92
K: row-593/colfam1:93/1428632365633/Put/vlen=2/seqid=191 V: 93
K: row-594/colfam1:94/1428632365642/Put/vlen=2/seqid=193 V: 94
K: row-595/colfam1:95/1428632365651/Put/vlen=2/seqid=195 V: 95
K: row-596/colfam1:96/1428632365671/Put/vlen=2/seqid=197 V: 96
K: row-597/colfam1:97/1428632365679/Put/vlen=2/seqid=199 V: 97
K: row-598/colfam1:98/1428632365684/Put/vlen=2/seqid=201 V: 98
K: row-599/colfam1:99/1428632365689/Put/vlen=2/seqid=203 V: 99
K: row-600/colfam1:00/1428632365694/Put/vlen=2/seqid=205 V: 00
K: row-601/colfam1:01/1428632365702/Put/vlen=2/seqid=207 V: 01
K: row-602/colfam1:02/1428632365709/Put/vlen=2/seqid=209 V: 02
K: row-603/colfam1:03/1428632365717/Put/vlen=2/seqid=211 V: 03
K: row-604/colfam1:04/1428632365722/Put/vlen=2/seqid=213 V: 04
K: row-605/colfam1:05/1428632365729/Put/vlen=2/seqid=215 V: 05
K: row-606/colfam1:06/1428632365752/Put/vlen=2/seqid=217 V: 06
K: row-607/colfam1:07/1428632365758/Put/vlen=2/seqid=219 V: 07
K: row-608/colfam1:08/1428632365765/Put/vlen=2/seqid=221 V: 08
K: row-609/colfam1:09/1428632365773/Put/vlen=2/seqid=223 V: 09
K: row-610/colfam1:10/1428632365778/Put/vlen=2/seqid=225 V: 10
K: row-611/colfam1:11/1428632365785/Put/vlen=2/seqid=227 V: 11
K: row-612/colfam1:12/1428632365791/Put/vlen=2/seqid=229 V: 12
K: row-613/colfam1:13/1428632365798/Put/vlen=2/seqid=231 V: 13
K: row-614/colfam1:14/1428632365803/Put/vlen=2/seqid=233 V: 14
K: row-615/colfam1:15/1428632365811/Put/vlen=2/seqid=235 V: 15
K: row-616/colfam1:16/1428632365820/Put/vlen=2/seqid=237 V: 16
K: row-617/colfam1:17/1428632365834/Put/vlen=2/seqid=239 V: 17
K: row-618/colfam1:18/1428632365840/Put/vlen=2/seqid=241 V: 18
K: row-619/colfam1:19/1428632365850/Put/vlen=2/seqid=243 V: 19
K: row-620/colfam1:20/1428632365856/Put/vlen=2/seqid=245 V: 20
K: row-621/colfam1:21/1428632365864/Put/vlen=2/seqid=247 V: 21
K: row-622/colfam1:22/1428632365874/Put/vlen=2/seqid=249 V: 22
K: row-623/colfam1:23/1428632365882/Put/vlen=2/seqid=251 V: 23
K: row-624/colfam1:24/1428632365896/Put/vlen=2/seqid=253 V: 24
K: row-625/colfam1:25/1428632365903/Put/vlen=2/seqid=255 V: 25
K: row-626/colfam1:26/1428632365908/Put/vlen=2/seqid=257 V: 26
K: row-627/colfam1:27/1428632365917/Put/vlen=2/seqid=259 V: 27
K: row-628/colfam1:28/1428632365928/Put/vlen=2/seqid=261 V: 28
K: row-629/colfam1:29/1428632365934/Put/vlen=2/seqid=263 V: 29
K: row-630/colfam1:30/1428632365940/Put/vlen=2/seqid=265 V: 30
K: row-631/colfam1:31/1428632365945/Put/vlen=2/seqid=267 V: 31
K: row-632/colfam1:32/1428632365957/Put/vlen=2/seqid=269 V: 32
K: row-633/colfam1:33/1428632365967/Put/vlen=2/seqid=271 V: 33
K: row-634/colfam1:34/1428632365982/Put/vlen=2/seqid=273 V: 34
K: row-635/colfam1:35/1428632365999/Put/vlen=2/seqid=275 V: 35
K: row-636/colfam1:36/1428632366004/Put/vlen=2/seqid=277 V: 36
K: row-637/colfam1:37/1428632366020/Put/vlen=2/seqid=279 V: 37
K: row-638/colfam1:38/1428632366031/Put/vlen=2/seqid=281 V: 38
K: row-639/colfam1:39/1428632366038/Put/vlen=2/seqid=283 V: 39
K: row-640/colfam1:40/1428632366048/Put/vlen=2/seqid=285 V: 40
K: row-641/colfam1:41/1428632366057/Put/vlen=2/seqid=287 V: 41
K: row-642/colfam1:42/1428632366240/Put/vlen=2/seqid=289 V: 42
K: row-643/colfam1:43/1428632366249/Put/vlen=2/seqid=291 V: 43
K: row-644/colfam1:44/1428632366256/Put/vlen=2/seqid=293 V: 44
K: row-645/colfam1:45/1428632366264/Put/vlen=2/seqid=295 V: 45
K: row-646/colfam1:46/1428632366270/Put/vlen=2/seqid=297 V: 46
K: row-647/colfam1:47/1428632366276/Put/vlen=2/seqid=299 V: 47
K: row-648/colfam1:48/1428632366284/Put/vlen=2/seqid=301 V: 48
K: row-649/colfam1:49/1428632366290/Put/vlen=2/seqid=303 V: 49
K: row-650/colfam1:50/1428632366300/Put/vlen=2/seqid=305 V: 50
K: row-651/colfam1:51/1428632366305/Put/vlen=2/seqid=307 V: 51
K: row-652/colfam1:52/1428632366313/Put/vlen=2/seqid=309 V: 52
K: row-653/colfam1:53/1428632366321/Put/vlen=2/seqid=311 V: 53
K: row-654/colfam1:54/1428632366330/Put/vlen=2/seqid=313 V: 54
K: row-655/colfam1:55/1428632366337/Put/vlen=2/seqid=315 V: 55
K: row-656/colfam1:56/1428632366343/Put/vlen=2/seqid=317 V: 56
K: row-657/colfam1:57/1428632366350/Put/vlen=2/seqid=319 V: 57
K: row-658/colfam1:58/1428632366363/Put/vlen=2/seqid=321 V: 58
K: row-659/colfam1:59/1428632366370/Put/vlen=2/seqid=323 V: 59
K: row-660/colfam1:60/1428632366384/Put/vlen=2/seqid=325 V: 60
K: row-661/colfam1:61/1428632366392/Put/vlen=2/seqid=327 V: 61
K: row-662/colfam1:62/1428632366397/Put/vlen=2/seqid=329 V: 62
K: row-663/colfam1:63/1428632366403/Put/vlen=2/seqid=331 V: 63
K: row-664/colfam1:64/1428632366410/Put/vlen=2/seqid=333 V: 64
K: row-665/colfam1:65/1428632366421/Put/vlen=2/seqid=335 V: 65
K: row-666/colfam1:66/1428632366430/Put/vlen=2/seqid=337 V: 66
K: row-667/colfam1:67/1428632366437/Put/vlen=2/seqid=339 V: 67
K: row-668/colfam1:68/1428632366444/Put/vlen=2/seqid=341 V: 68
K: row-669/colfam1:69/1428632366461/Put/vlen=2/seqid=343 V: 69
K: row-670/colfam1:70/1428632366477/Put/vlen=2/seqid=345 V: 70
K: row-671/colfam1:71/1428632366487/Put/vlen=2/seqid=347 V: 71
K: row-672/colfam1:72/1428632366498/Put/vlen=2/seqid=349 V: 72
K: row-673/colfam1:73/1428632366507/Put/vlen=2/seqid=351 V: 73
K: row-674/colfam1:74/1428632366520/Put/vlen=2/seqid=353 V: 74
K: row-675/colfam1:75/1428632366530/Put/vlen=2/seqid=355 V: 75
K: row-676/colfam1:76/1428632366542/Put/vlen=2/seqid=357 V: 76
K: row-677/colfam1:77/1428632366555/Put/vlen=2/seqid=359 V: 77
K: row-678/colfam1:78/1428632366578/Put/vlen=2/seqid=361 V: 78
K: row-679/colfam1:79/1428632366588/Put/vlen=2/seqid=363 V: 79
K: row-680/colfam1:80/1428632366596/Put/vlen=2/seqid=365 V: 80
K: row-681/colfam1:81/1428632366604/Put/vlen=2/seqid=367 V: 81
K: row-682/colfam1:82/1428632366617/Put/vlen=2/seqid=369 V: 82
K: row-683/colfam1:83/1428632366629/Put/vlen=2/seqid=371 V: 83
K: row-684/colfam1:84/1428632366640/Put/vlen=2/seqid=373 V: 84
K: row-685/colfam1:85/1428632366649/Put/vlen=2/seqid=375 V: 85
K: row-686/colfam1:86/1428632366658/Put/vlen=2/seqid=377 V: 86
K: row-687/colfam1:87/1428632366664/Put/vlen=2/seqid=379 V: 87
K: row-688/colfam1:88/1428632366673/Put/vlen=2/seqid=381 V: 88
K: row-689/colfam1:89/1428632366680/Put/vlen=2/seqid=383 V: 89
K: row-690/colfam1:90/1428632366686/Put/vlen=2/seqid=385 V: 90
K: row-691/colfam1:91/1428632366693/Put/vlen=2/seqid=387 V: 91
K: row-692/colfam1:92/1428632366701/Put/vlen=2/seqid=389 V: 92
K: row-693/colfam1:93/1428632366857/Put/vlen=2/seqid=391 V: 93
K: row-694/colfam1:94/1428632366868/Put/vlen=2/seqid=393 V: 94
K: row-695/colfam1:95/1428632366873/Put/vlen=2/seqid=395 V: 95
K: row-696/colfam1:96/1428632366881/Put/vlen=2/seqid=397 V: 96
K: row-697/colfam1:97/1428632366890/Put/vlen=2/seqid=399 V: 97
K: row-698/colfam1:98/1428632366896/Put/vlen=2/seqid=401 V: 98
K: row-699/colfam1:99/1428632366902/Put/vlen=2/seqid=403 V: 99
Block index size as per heapsize: 400

///dumps the internal HFile.Reader properties
reader=/hbase/data/default/table3/1fa2e49c7404d3cd39afc39a99cc1c26/colfam1/0f6fc234c3014b6e9d84d3cae065d1b4,
    compression=none,
    cacheConf=CacheConfig:disabled,
    firstKey=row-500/colfam1:00/1428632364152/Put,
    lastKey=row-699/colfam1:99/1428632366902/Put,
    avgKeyLen=28,
    avgValueLen=2,
    entries=200,
    length=13581

///Trailer block information
Trailer:
    fileinfoOffset=8857,
    loadOnOpenDataOffset=8742,
    dataIndexCount=1,
    metaIndexCount=0,
    totalUncomressedBytes=13483,
    entryCount=200,
    compressionCodec=NONE,
    uncompressedDataIndexSize=41,
    numDataIndexLevels=1,
    firstDataBlockOffset=0,
    lastDataBlockOffset=0,
    comparatorClassName=org.apache.hadoop.hbase.KeyValue$KeyComparator,
    encryptionKey=NONE,
    majorVersion=3,
    minorVersion=0

///FileInfo block information
Fileinfo:
    BLOOM_FILTER_TYPE = ROW
    DELETE_FAMILY_COUNT = \x00\x00\x00\x00\x00\x00\x00\x00
    EARLIEST_PUT_TS = \x00\x00\x01L\xA1\x1F\xE4x
    KEY_VALUE_VERSION = \x00\x00\x00\x01
    LAST_BLOOM_KEY = row-699
    MAJOR_COMPACTION_KEY = \x00
    MAX_MEMSTORE_TS_KEY = \x00\x00\x00\x00\x00\x00\x01\x93
    MAX_SEQ_ID_KEY = 404
    TIMERANGE = 1428632364152....1428632366902
    hfile.AVG_KEY_LEN = 28
    hfile.AVG_VALUE_LEN = 2
    hfile.LASTKEY = \x00\x07row-699\x07colfam199\x00\x00\x01L\xA1\x1F\xEF6\x04
    hfile.MAX_TAGS_LEN = \x00\x00\x00\x00
    hfile.TAGS_COMPRESSED = \x00
Mid-key: \x00\x07row-500\x07colfam100\x00\x00\x01L\xA1\x1F\xE4x\x04
Bloom filter:
    BloomSize: 256
    No of Keys in bloom: 200
    Max Keys for bloom: 213
    Percentage filled: 94%
    Number of chunks: 1
    Comparator: RawBytesComparator
Delete Family Bloom filter:
    Not present
///Total number of KVs scanned
Scanned kv count -> 200

 

2. KeyValue format

In an HFile, a KeyValue is a byte array made up of the following parts:
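As a rough illustration of the key portion only, the sketch below decodes the `hfile.LASTKEY` bytes printed in the Fileinfo section of the dump above. It assumes the commonly documented key layout: a 2-byte row length, the row, a 1-byte column family length, the family, the qualifier, an 8-byte big-endian timestamp, and a 1-byte type code. `parse_key` and `TYPE_NAMES` are hypothetical names written for this article, not HBase APIs.

```python
import struct

# Type code 4 is Put in HBase's KeyValue.Type; other codes are left out here.
TYPE_NAMES = {4: "Put"}


def parse_key(key: bytes) -> dict:
    """Split a serialized KeyValue key into its fields (illustrative only)."""
    (row_len,) = struct.unpack_from(">H", key, 0)   # 2-byte row length
    pos = 2
    row = key[pos:pos + row_len]
    pos += row_len
    fam_len = key[pos]                              # 1-byte family length
    pos += 1
    family = key[pos:pos + fam_len]
    pos += fam_len
    # The qualifier has no length prefix: it is whatever remains before
    # the trailing 8-byte timestamp and 1-byte type.
    qualifier = key[pos:len(key) - 9]
    (timestamp,) = struct.unpack_from(">Q", key, len(key) - 9)
    type_code = key[-1]
    return {
        "row": row,
        "family": family,
        "qualifier": qualifier,
        "timestamp": timestamp,
        "type": TYPE_NAMES.get(type_code, type_code),
    }


# hfile.LASTKEY exactly as printed in the dump above.
LAST_KEY = b"\x00\x07row-699\x07colfam199\x00\x00\x01L\xA1\x1F\xEF6\x04"

fields = parse_key(LAST_KEY)
print(fields)
print(len(LAST_KEY))
```

The decoded timestamp agrees with the `lastKey=row-699/colfam1:99/1428632366902/Put` line in the dump, and the 28-byte length agrees with `avgKeyLen=28`, which explains why each key is so much larger than its 2-byte value.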

 

 

3. HFile data structure


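3.1 Trailer block
The Trailer is fixed-length. As the figure shows, it holds pointers to the starting offsets of the other blocks. When an HFile is opened, the Trailer is read first, and then the Data Block Index is loaded into memory. A lookup for a key therefore does not need to scan the whole HFile: the block that contains the key is located in the in-memory index, the whole block is read into memory with a single disk I/O, and the key is then found inside that block.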
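The lookup step described above can be sketched as a binary search over the first key of each block. This is a simplified model, not HBase's implementation: the index entries, offsets, and sizes are invented for illustration, `find_block` is a hypothetical helper, and plain row keys stand in for the full KeyValue keys that a real index stores.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Each index entry: (first key in the block, block offset, block size).
BlockEntry = Tuple[bytes, int, int]


def find_block(index: List[BlockEntry], key: bytes) -> Optional[BlockEntry]:
    """Return the only block that could contain `key`, or None.

    The candidate is the last block whose first key is <= key. A key that
    sorts before the first block's first key cannot be in the file.
    """
    first_keys = [entry[0] for entry in index]
    i = bisect_right(first_keys, key) - 1
    return index[i] if i >= 0 else None


# Invented index with three 64 KB blocks.
index = [
    (b"row-500", 0, 65536),
    (b"row-560", 65536, 65536),
    (b"row-640", 131072, 65536),
]

print(find_block(index, b"row-575"))
print(find_block(index, b"row-100"))
```

Only the block returned by the search has to be fetched from disk (or from the block cache); the key is then located by scanning inside that one block.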

 

3.2 File Info block
The File Info block is one of the fixed blocks of the file. It records metadata about the file, for example AVG_KEY_LEN, AVG_VALUE_LEN, LAST_KEY, COMPARATOR, MAX_SEQ_ID_KEY, and so on.
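Several of the File Info values in the dump above are printed as escaped bytes. A minimal sketch for reading them, assuming they are big-endian integers (which is how HBase's `Bytes` utility serializes ints and longs); `as_int` is a hypothetical helper:

```python
# Raw values copied from the Fileinfo section of the dump above.
EARLIEST_PUT_TS = b"\x00\x00\x01L\xA1\x1F\xE4x"
MAX_MEMSTORE_TS_KEY = b"\x00\x00\x00\x00\x00\x00\x01\x93"
KEY_VALUE_VERSION = b"\x00\x00\x00\x01"


def as_int(raw: bytes) -> int:
    """Interpret raw bytes as an unsigned big-endian integer."""
    return int.from_bytes(raw, byteorder="big")


print(as_int(EARLIEST_PUT_TS))
print(as_int(MAX_MEMSTORE_TS_KEY))
print(as_int(KEY_VALUE_VERSION))
```

Read this way, EARLIEST_PUT_TS equals the lower bound of TIMERANGE (1428632364152), and MAX_MEMSTORE_TS_KEY equals 403, the seqid of the last KV in the dump.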

 

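3.3 Data Block
Data Blocks hold the table's data and are the basic unit of HBase I/O. To improve efficiency, the HRegionServer keeps an LRU-based block cache. The size of each Data Block can be set with a parameter when a table is created: large blocks favor sequential scans, while small blocks favor random lookups.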
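The idea of an LRU block cache can be sketched in a few lines. This toy version is an assumption-laden illustration, not HBase's cache: it counts blocks rather than bytes, has no priority tiers, and `ToyBlockCache` and `make_loader` are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable, Tuple

BlockKey = Tuple[str, int]  # (file name, block offset)


class ToyBlockCache:
    """Toy LRU cache for data blocks. Illustrative only."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity          # maximum number of cached blocks
        self._blocks: "OrderedDict[BlockKey, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: BlockKey, load: Callable[[], bytes]) -> bytes:
        """Return the cached block, loading and caching it on a miss."""
        if key in self._blocks:
            self.hits += 1
            self._blocks.move_to_end(key)     # mark as most recently used
            return self._blocks[key]
        self.misses += 1
        block = load()                        # stands in for one disk read
        self._blocks[key] = block
        if len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)  # evict least recently used
        return block


def make_loader(offset: int) -> Callable[[], bytes]:
    """Build a fake 'disk read' for the block at `offset`."""
    return lambda: b"block@%d" % offset


cache = ToyBlockCache(capacity=2)
for offset in [0, 65536, 0, 131072, 65536]:
    cache.get(("hfile-a", offset), make_loader(offset))

print(cache.hits, cache.misses)
```

With room for two blocks, the second read of offset 0 is served from memory, while the later read of offset 65536 misses because that block was evicted when offset 131072 was loaded.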

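Apart from the Magic at its start, each Data Block is simply a sequence of concatenated KeyValue pairs. The Magic is a short marker sequence of seemingly arbitrary bytes whose purpose is to help detect data corruption. Every block has a magic number.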

 

 

About the Data Block size

Minimum block size. We recommend a setting of minimum block size between 8KB to 1MB for general usage. Larger block size is preferred if files are primarily for sequential
access. However, it would lead to inefficient random access (because there are more data to decompress). Smaller blocks are good for random access, but require more memory
to hold the block index, and may be slower to create (because we must flush the compressor stream at the conclusion of each data block, which leads to an FS I/O flush).
Further, due to the internal caching in Compression codec, the smallest possible block size would be around 20KB-30KB.

 

 

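3.4 Meta Block section (optional):
Holds user-defined KeyValue pairs and can be compressed.
Data Block Index section:
The index of the Data Blocks. The key of each index entry is the key of the first record in the block it points to. The index blocks record the offsets of the data and meta blocks.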

 

4. Relationship between HFile blocks and HDFS blocks

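The default HFile block size is 64 KB, while the default HDFS block size is 64 MB (the Hadoop 1.x default; Hadoop 2.x and later default to 128 MB). With the 64 MB default, an HDFS block is therefore 1024 times the size of an HFile block. The figure below shows how the blocks of a 232 MB HFile are stored in HDFS blocks.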
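The arithmetic behind that layout can be worked through directly. This is a back-of-the-envelope sketch using the sizes cited in the text; it treats the whole file as data blocks of exactly 64 KB, which is an approximation, since the HFile block size is a target and the file also contains index, File Info, and Trailer sections.

```python
import math

KB = 1024
MB = 1024 * KB

HFILE_BLOCK = 64 * KB   # HFile block size cited in the text
HDFS_BLOCK = 64 * MB    # HDFS block size cited in the text
HFILE_SIZE = 232 * MB   # size of the example HFile

# How many HFile blocks fit in one HDFS block.
ratio = HDFS_BLOCK // HFILE_BLOCK

# How many HDFS blocks the file occupies, and how full the last one is.
hdfs_blocks = math.ceil(HFILE_SIZE / HDFS_BLOCK)
last_hdfs_block_mb = (HFILE_SIZE - (hdfs_blocks - 1) * HDFS_BLOCK) // MB

# Approximate number of HFile blocks in the whole file.
approx_hfile_blocks = HFILE_SIZE // HFILE_BLOCK

print(ratio, hdfs_blocks, last_hdfs_block_mb, approx_hfile_blocks)
```

Under these assumptions the 232 MB file spans four HDFS blocks, three full ones plus a final block holding 40 MB, and contains roughly 3712 HFile blocks.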

 

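5. HFile compaction

Write path: the client writes -> the data is stored in the MemStore until the MemStore is full -> the MemStore is flushed into a StoreFile, and the number of StoreFiles grows until it reaches a threshold -> a Compact (merge) operation is triggered -> multiple StoreFiles are merged into one StoreFile, with versions merged and deleted data removed at the same time -> as StoreFiles are compacted, progressively larger StoreFiles form -> once a single StoreFile exceeds a size threshold, a Split is triggered: the current Region is split into 2 Regions, the parent Region goes offline, and the 2 new child Regions are assigned by the HMaster to the appropriate HRegionServers, so that the load on the original Region is spread across 2 Regions.

This process shows that HBase only ever appends data. All updates and deletes are actually applied during compaction, so a user write can return as soon as it has reached memory (and, by default, the write-ahead log), which keeps write I/O fast.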
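The flush and compaction steps above can be modeled with a toy store. Everything here is a simplification made for illustration: `ToyStore` is a hypothetical class, the thresholds count cells and files rather than bytes, only one version per row is kept, and the compaction behaves like a major compaction in that it drops delete markers. Region splits are not modeled.

```python
from typing import Dict, List, Optional, Tuple

# A cell is (row, sequence id, value); a value of None is a delete marker.
Cell = Tuple[str, int, Optional[str]]


class ToyStore:
    """Toy model of MemStore -> flush -> compact. Thresholds are invented."""

    def __init__(self, flush_at: int = 3, compact_at: int = 3) -> None:
        self.flush_at = flush_at        # cells in MemStore that trigger a flush
        self.compact_at = compact_at    # store files that trigger a compaction
        self.memstore: List[Cell] = []
        self.store_files: List[List[Cell]] = []
        self.seq = 0

    def write(self, row: str, value: Optional[str]) -> None:
        """Append-only write: puts and deletes both just add a cell."""
        self.seq += 1
        self.memstore.append((row, self.seq, value))
        if len(self.memstore) >= self.flush_at:
            self.flush()

    def flush(self) -> None:
        """Write the MemStore out as one sorted, immutable store file."""
        if self.memstore:
            self.store_files.append(sorted(self.memstore))
            self.memstore = []
        if len(self.store_files) >= self.compact_at:
            self.compact()

    def compact(self) -> None:
        """Merge all store files, keeping the newest cell per row and
        dropping rows whose newest cell is a delete marker."""
        newest: Dict[str, Cell] = {}
        for store_file in self.store_files:
            for cell in store_file:
                row, seq, _ = cell
                if row not in newest or seq > newest[row][1]:
                    newest[row] = cell
        merged = sorted(c for c in newest.values() if c[2] is not None)
        self.store_files = [merged]


store = ToyStore()
writes = [("r1", "a"), ("r2", "b"), ("r3", "c"),
          ("r1", "a2"), ("r2", None), ("r4", "d"),
          ("r5", "e"), ("r3", "c2"), ("r6", "f")]
for row, value in writes:
    store.write(row, value)

print(store.store_files)
```

In this run the overwrite of r1 and r3 and the delete of r2 are all recorded as plain appends; the old versions and the deleted row only disappear when the third flush triggers the merge into a single store file.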

 



 

 

 

 

 

 

 

 

 Reference: http://blog.csdn.net/john_f_lau/article/details/18899311
