HBase Performance Stress Testing

[hadoop@MASTER]$ hbase pe
Usage: java org.apache.hadoop.hbase.PerformanceEvaluation <OPTIONS> [-D<property=value>]* <command> <nclients>

Options:
nomapred Run multiple clients using threads (rather than use mapreduce)
rows Rows each client runs. Default: One million
size Total size in GiB. Mutually exclusive with --rows. Default: 1.0.
sampleRate Execute test on a sample of total rows. Only supported by randomRead. Default: 1.0
traceRate Enable HTrace spans. Initiate tracing every N rows. Default: 0
table Alternate table name. Default: 'TestTable'
multiGet If >0, when doing RandomRead, perform multiple gets instead of single gets. Default: 0
compress Compression type to use (GZ, LZO, ...). Default: 'NONE'
flushCommits Used to determine if the test should flush the table. Default: false
writeToWAL Set writeToWAL on puts. Default: True
autoFlush Set autoFlush on htable. Default: False
oneCon all the threads share the same connection. Default: False
presplit Create presplit table. Recommended for accurate perf analysis (see guide). Default: disabled
inmemory Tries to keep the HFiles of the CF inmemory as far as possible. Not guaranteed that reads are always served from memory. Default: false
usetags Writes tags along with KVs. Use with HFile V3. Default: false
numoftags Specify the no of tags that would be needed. This works only if usetags is true.
filterAll Helps to filter out all the rows on the server side there by not returning any thing back to the client. Helps to check the server side performance. Uses FilterAllFilter internally.
latency Set to report operation latencies. Default: False
bloomFilter Bloom filter type, one of [NONE, ROW, ROWCOL]
valueSize Pass value size to use: Default: 1024
valueRandom Set if we should vary value size between 0 and 'valueSize'; set on read for stats on size: Default: Not set.
valueZipf Set if we should vary value size between 0 and 'valueSize' in zipf form: Default: Not set.
period Report every 'period' rows: Default: opts.perClientRunRows / 10
multiGet Batch gets together into groups of N. Only supported by randomRead. Default: disabled
addColumns Adds columns to scans/gets explicitly. Default: true
replicas Enable region replica testing. Defaults: 1.
splitPolicy Specify a custom RegionSplitPolicy for the table.
randomSleep Do a random sleep before each get between 0 and entered value. Defaults: 0
columns Columns to write per row. Default: 1
caching Scan caching to use. Default: 30

Note: -D properties will be applied to the conf used.
For example:
-Dmapreduce.output.fileoutputformat.compress=true
-Dmapreduce.task.timeout=60000

Command:
filterScan Run scan test using a filter to find a specific row based on its value (make sure to use --rows=20)
randomRead Run random read test
randomSeekScan Run random seek and scan 100 test
randomWrite Run random write test
scan Run scan test (read every row)
scanRange10 Run random seek scan with both start and stop row (max 10 rows)
scanRange100 Run random seek scan with both start and stop row (max 100 rows)
scanRange1000 Run random seek scan with both start and stop row (max 1000 rows)
scanRange10000 Run random seek scan with both start and stop row (max 10000 rows)
sequentialRead Run sequential read test
sequentialWrite Run sequential write test

Args:
nclients Integer. Required. Total number of clients (and HRegionServers) running: 1 <= value <= 500
Examples:
To run a single evaluation client:
$ bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
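
As a rough illustration of how the pieces of the usage line fit together, the sketch below assembles pe command lines and only prints them; it does not start a test. The helper name pe_cmd and the specific values (100000 rows, 10 presplit regions, the table name PeDemo) are assumptions made for illustration. The option names come from the listing above, written in the --name=value form.

```shell
# Hypothetical helper: print an `hbase pe` command line without running it.
# Usage: pe_cmd <command> <nclients> [options...]
pe_cmd() {
  pe_command="$1"
  pe_nclients="$2"
  shift 2
  pe_line="hbase pe"
  # Options go between `hbase pe` and the command, as in the usage line.
  for pe_opt in "$@"; do
    pe_line="$pe_line $pe_opt"
  done
  echo "$pe_line $pe_command $pe_nclients"
}

# Tool options only: thread-based clients, fewer rows, a presplit table.
pe_cmd randomWrite 4 --nomapred --rows=100000 --presplit=10 --table=PeDemo

# A -D property only (value taken from the note in the usage text).
pe_cmd sequentialWrite 1 -Dmapreduce.task.timeout=60000
```

Printing the command first makes it easy to review the option order before pasting it into a real cluster shell.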

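By default, pe runs as a MapReduce job; the nomapred option listed above runs the clients as threads instead.
The nclients argument can be thought of as the number of threads (or MapReduce tasks) created to run the test.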
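
The nclients range from the usage text (1 <= value <= 500) can be checked before launching anything. The sketch below is a hypothetical wrapper, not part of HBase: pe_plan and its "threads"/"mapreduce" mode names are invented here. It validates the count and prints the command for either mode, using the nomapred option to switch from MapReduce to threads.

```shell
# Hypothetical wrapper: validate nclients, then print the command to run.
# mode is "threads" (adds --nomapred) or anything else (pe's MapReduce default).
pe_plan() {
  plan_mode="$1"
  plan_command="$2"
  plan_nclients="$3"
  # Reject empty or non-numeric client counts.
  case "$plan_nclients" in
    ''|*[!0-9]*) echo "nclients must be an integer" >&2; return 1 ;;
  esac
  # Enforce the range stated in the usage text.
  if [ "$plan_nclients" -lt 1 ] || [ "$plan_nclients" -gt 500 ]; then
    echo "nclients must be between 1 and 500" >&2
    return 1
  fi
  if [ "$plan_mode" = "threads" ]; then
    echo "hbase pe --nomapred $plan_command $plan_nclients"
  else
    echo "hbase pe $plan_command $plan_nclients"
  fi
}

pe_plan threads sequentialWrite 4
pe_plan mapreduce sequentialWrite 4
```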
Sequential write command:
hbase pe sequentialWrite 1

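Note: this command starts a single client and runs a sequential write test. It keeps printing progress until the final results are printed. Once you have confirmed that the load on the client and the servers is not heavy, you can increase the number of clients (that is, threads or MapReduce tasks):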
hbase pe sequentialWrite 4
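
One way to follow that advice is to raise the client count in steps and check the load between runs. The sketch below assumes a DRY_RUN variable (hypothetical, defaulting to 1) so that it only prints the commands unless explicitly told to execute them. The client counts 1, 2, 4, 8 are arbitrary.

```shell
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 actually runs it.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run the sequential write test once per client count, stopping on failure.
scale_clients() {
  for n in "$@"; do
    run_step hbase pe sequentialWrite "$n" || return 1
  done
}

scale_clients 1 2 4 8
```

With DRY_RUN=0 the loop stops at the first failing run, which leaves room to inspect the cluster before going further.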

Sequential read:
Command:
hbase pe sequentialRead 1

Random write:
Command:
hbase pe randomWrite 1

Random read:
Command:
hbase pe randomRead 1
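
The four commands above can be collected into one small script. This is only a sketch: the ordering (the sequential write first, so that the read tests have data in the table) and the PE_DRY_RUN switch are assumptions added here, not something pe requires or provides.

```shell
# PE_DRY_RUN=1 (default) only prints the commands; PE_DRY_RUN=0 runs them.
PE_DRY_RUN="${PE_DRY_RUN:-1}"

# Usage: run_suite [nclients]   (nclients defaults to 1)
run_suite() {
  suite_nclients="${1:-1}"
  for suite_test in sequentialWrite sequentialRead randomWrite randomRead; do
    if [ "$PE_DRY_RUN" = "1" ]; then
      echo "hbase pe $suite_test $suite_nclients"
    else
      echo "== $suite_test =="
      hbase pe "$suite_test" "$suite_nclients" || return 1
    fi
  done
}

run_suite 1
```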

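About Zeno Chen

I work in many fields, a jack of all trades and master of none. Programming languages: Perl, Java, PHP, Python; database systems: MySQL, Oracle. I occasionally do circuit board development, focusing on STM32 microcontrollers.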