Agenda for HBaseCon Asia 2017, the first Apache HBaseCon in Asia

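HBaseCon is the official technical conference of Apache HBase™, first held in 2012. Apache HBase is a distributed, scalable key-value database built on Apache Hadoop. It provides high-performance random reads and writes for big data workloads, and its implementation follows the Bigtable paper that Google published in 2006.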

Date and time: 2017.08.04, 08:00-18:00

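Venue: International Conference Center, 3F, Block D, Building 3, Phase 1, Tian'an Cloud Park (天安云谷), Huancheng Road, Bantian Subdistrict, Longgang District, Shenzhen, China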

Audience: developers

Talk abstracts

Keynote

HBase 2.0.0
Michael Stack

HBase-2.0.0 has been a couple of years in the making. It is chock-a-block full of a long list of new features and fixes. In this session, the 2.0.0 release manager will perform the impossible, describing the release content inside the session time bounds.

HBase Practice At XiaoMi
Zheng Hu

We'll share some HBase experience at XiaoMi:
1. How we tuned G1GC for HBase clusters.
2. Development and performance of the async HBase client.

 

Track 1

Offheap bucket cache success story and Offheaping the write path in HBase
Ramkrishna Vasudevan and Anoop Sam John

The first part of the talk covers the success story of deploying the latest improvements to offheap mode bucket cache in one of the biggest clusters at Alibaba.
It highlights how offheap reads from the bucket cache helped improve the average QPS and avoided the frequent dips in QPS caused by GC.
The second part covers the efforts that went into making the HBase write path use offheap memory effectively, the various design changes in terms of size accounting, and the performance gains we achieved at the end of the task.

HBase Multi tenancy use cases and various solution
Bhupendra Jain

In a multi-tenant scenario, the biggest challenge is to achieve QoS for each tenant without impacting other tenants' workloads. This session covers the multi-tenancy use cases and challenges present in HBase, and discusses in detail:
a) Achieving multi-tenancy with a single HBase cluster - solutions, pros and cons (RS Group, RPC throttling, quotas, etc.)
b) Achieving multi-tenancy with multiple HBase clusters - solutions, pros and cons.

Lift the ceiling of HBase throughputs
Yu Li and Lijin Bin

HBase is the core storage of Alibaba's search infrastructure, and improving its throughput is a big challenge: throughput determines how fast the machine learning programs run, and thus the accuracy of the recommendations made. In this session we will talk about work done and in progress to increase both read and write throughput, as well as real-world performance on the past Singles' Day and the latest benchmark data from the lab.

Removable singularity: a story of HBase upgrade in Pinterest
Tianying Chang

HBase serves online-facing traffic at Pinterest, which means no downtime is allowed. However, we were on HBase 0.94. To upgrade to the latest version, we needed a way to upgrade live while keeping the Pinterest site up. Recently, we successfully upgraded our HBase 0.94 cluster to 1.2 with no downtime. We made changes to both Asynchbase and the HBase server side. We will talk about what we did and how we did it, and about our findings from the configuration and performance tuning we did to achieve low latency.

HBase Disaster Recovery Solution at Huawei
Ashish Singhi

The HBase disaster recovery solution aims to maintain high availability of the HBase service when one HBase cluster suffers a disaster, with minimal user intervention. This session will introduce the HBase disaster recovery use cases and the various solutions adopted at Huawei, such as:
a) Cluster Read-Write mode
b) DDL operations synchronization with standby cluster
c) Mutation and bulk loaded data replication
d) Further challenges and pending work

Backup / Restore feature in HBase
Vladimir Rodionov and Ted Yu

Backup and restore functionality is crucial to achieving fault tolerance for data management systems.
In this talk, we will cover the newly merged phases 2 and 3 of backup and restore.
Previously, users could take snapshots to back up data. However, the associated execution cost may be high because of the flush across region servers, and there was no incremental snapshot either.
Backup and restore functionality provides two types of backup:
Full backup – foundation for incremental backups
Incremental backup – can be periodic to capture changes over time
We'll cover three types of backup strategies:
Intra-cluster backup
Backup on a separate HDFS archive cluster
Backup involving the cloud or a storage vendor
Best practices for backup and restore will be presented next.
We'll explain concepts such as Backup Image and Backup Set, with example commands showing how they are used.
The mechanism for incremental backups is covered next.
Finally we'll cover bulk load support for backup.

HBase on Beam
Jingcheng Du

Apache Beam is an open source, unified programming model for defining batch and streaming jobs that run on many execution engines. HBase on Beam is a connector that allows Beam to use HBase as a bounded data source and as a target data store for both batch and streaming data sets. With this connector, HBase can work directly with many batch and streaming engines, for example Spark, Flink and Google Cloud Dataflow. In this session, I will introduce Apache Beam, the current implementation of HBase on Beam, and the future plans for it.

 

Track 2

HBase: recent improvement and practice at Alibaba
Wenlong Yang and Han Yang

AliHB, an HBase branch tailored to Alibaba Group's business characteristics and requirements, is widely used as a basic storage service supporting the online and nearline applications of companies across the Alibaba economy, such as taobao.com, tmall.com, alipay.com and cainiao.com.
In this talk, we will share our experience in maintaining clusters of more than ten thousand nodes with high availability and at low cost:
1. An introduction to several typical scenarios at Alibaba
2. SQL (based on Apache Phoenix) improvements
3. Range-level data copy across clusters
4. Prefix Bloom filter for scan performance
5. Dual-Service based on the async API, enabling concurrent access to two clusters for predictably low latency
6. Some useful tips for production.

Ecosystems with HBase and CloudTable service at Huawei
Jieshan Bi and Yanhui Zhong

CloudTable: Huawei's cloud HBase service, which will be going online.
1. Our view on HBase.
2. CloudTable service based on HBase.
CTBase: A light-weight HBase client for structured data.
1. Schematized table, more friendly for structured data storage.
2. Global secondary index for HBase.
3. HBase Query DSL. JSON based light-weight API.
4. Cluster table. Pre-joining with keys, a better solution for cross-table join queries from HBase.
Tagram: Distributed bitmap index implementation for HBase.
1. Distributed bitmap index for accelerating ad hoc queries on low-cardinality columns.
2. Powerful and flexible query API.
3. Tagram offers millisecond-level query latency.
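
To illustrate why a bitmap index suits low-cardinality columns, here is a minimal single-process sketch in Python. It is not Tagram's implementation or API: the class `BitmapIndex`, its methods, and the sample columns are all hypothetical, and the distributed aspect is left out entirely. Each (column, value) pair gets one bitmask of row ids, so a multi-condition query reduces to bitwise AND/OR.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy in-memory bitmap index (hypothetical, not Tagram's API).

    Keeps one integer bitmask per (column, value); bit i is set when
    row i holds that value.
    """

    def __init__(self):
        self._bitmaps = defaultdict(int)  # (column, value) -> bitmask of row ids
        self._row_count = 0

    def add_row(self, row):
        """Index one row (a dict of column -> value) and return its row id."""
        row_id = self._row_count
        for column, value in row.items():
            self._bitmaps[(column, value)] |= 1 << row_id
        self._row_count += 1
        return row_id

    def bitmap(self, column, value):
        """Return the bitmask of rows where column == value (0 if none)."""
        return self._bitmaps.get((column, value), 0)

    @staticmethod
    def row_ids(bitmask):
        """Expand a bitmask into a sorted list of row ids."""
        ids = []
        row_id = 0
        while bitmask:
            if bitmask & 1:
                ids.append(row_id)
            bitmask >>= 1
            row_id += 1
        return ids


index = BitmapIndex()
index.add_row({"gender": "F", "city": "Shenzhen"})
index.add_row({"gender": "M", "city": "Beijing"})
index.add_row({"gender": "F", "city": "Beijing"})
index.add_row({"gender": "F", "city": "Shenzhen"})

# gender = F AND city = Shenzhen is a single bitwise AND
both = BitmapIndex.row_ids(
    index.bitmap("gender", "F") & index.bitmap("city", "Shenzhen")
)
# gender = M OR city = Beijing is a single bitwise OR
either = BitmapIndex.row_ids(
    index.bitmap("gender", "M") | index.bitmap("city", "Beijing")
)
print(both, either)
```

Because a low-cardinality column has few distinct values, the number of bitmaps stays small, which is what makes this layout attractive for ad hoc filtering.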

Large scale data near-line loading method and architecture
Shuaifeng Zhou

When we load data into HBase in real time, we use the put/putlist interface. After receiving a put request, the regionserver writes the WAL, writes the data into the memory store, flushes the memory store to the disk store, and then compacts files again and again. That procedure occupies too many resources and degrades read/write performance. To solve this problem, we provide a near-line loading method and architecture that greatly increases loading bandwidth and reduces the impact on read operations.
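
The put path described in this abstract (WAL, memory store, flush, repeated compaction) can be pictured with a small toy model in Python. This is only a sketch under simplifying assumptions, not HBase code: the class `ToyRegion`, its thresholds and its counters are hypothetical, and it does not model the near-line loading method itself, which the abstract does not detail.

```python
class ToyRegion:
    """Toy model of the put path (hypothetical; not HBase's implementation)."""

    def __init__(self, flush_threshold=3, compact_threshold=2):
        self.wal = []           # append-only log of every put
        self.memstore = {}      # in-memory buffer of recent writes
        self.store_files = []   # immutable sorted files, oldest first
        self.flush_threshold = flush_threshold
        self.compact_threshold = compact_threshold
        self.compactions = 0    # how many times files were rewritten

    def put(self, key, value):
        self.wal.append((key, value))   # 1. write the WAL
        self.memstore[key] = value      # 2. write the memory store
        if len(self.memstore) >= self.flush_threshold:
            self._flush()               # 3. flush to the disk store

    def _flush(self):
        self.store_files.append(sorted(self.memstore.items()))
        self.memstore = {}
        if len(self.store_files) >= self.compact_threshold:
            self._compact()             # 4. compact again and again

    def _compact(self):
        merged = {}
        for store_file in self.store_files:  # oldest to newest, newer wins
            merged.update(store_file)
        self.store_files = [sorted(merged.items())]
        self.compactions += 1

    def get(self, key):
        if key in self.memstore:
            return self.memstore[key]
        for store_file in reversed(self.store_files):  # newest file first
            for stored_key, value in store_file:
                if stored_key == key:
                    return value
        return None


region = ToyRegion()
for i in range(9):
    region.put(f"row{i}", i)

print(len(region.wal), len(region.store_files), region.compactions)
```

Even in this toy, every put is written at least twice (WAL and flush) and rewritten again on each compaction, which hints at why heavy real-time loading competes with reads for resources.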

HBase at JD
Xingbo Peng, Nan Zhang and Bang Wen

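1. Current scale
Within JD's CTO organization, HBase has grown over several years to a cluster scale of 3000+ machines supporting 600+ of JD's business systems. These HBase clusters have withstood multiple 618 and Double 11 (Singles' Day) shopping festivals, and JD's CTO organization is a major HBase user.
2. Business scenarios
An introduction to the typical business uses of HBase at JD, including monitoring, risk control, recommendation and advertising.
3. High-availability improvements
An introduction to our work on the high availability of HBase clusters, including cross-datacenter disaster recovery, multi-tenancy with resource groups, and cluster security.
4. Operations practice
An introduction to our practices in operating HBase clusters, including the HBase cluster monitoring system Mummut, the alerting system, integrating HBase clusters with the big data platform, business operations and data migration.
5. Future outlook
An introduction to the work we are doing and plan to do on top of HBase, including Kylin, Phoenix and containerized deployment.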

Synchronous replication for HBase
Shen Chunhui and Meng Qingyi

This talk will share the detailed implementation and production practice of synchronous replication between clusters on Alibaba's internal HBase branch.
It covers how to keep data consistent, how to switch client access between clusters automatically, and the related performance and monitoring.

Enterprise-grade big data platform based on HBase
Xinyu Zhang, Xueliang Chen and Zheng Fan

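The HBase-based big data platform has become a very important foundational data platform within China Life's new-generation integrated business processing system. The platform has so far consolidated hundreds of terabytes of data, integrating the customer, business and contact data of several hundred million customers into one unified data model, from which thousands of customer tags have been derived. On top of the platform, customers, sales agents and internal managers are served with applications for sales support, customer service, operations support and more. Information retrieval and query are offered through apps, web pages and other channels, and deep learning models power data applications such as anti-fraud.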

HBase usage and practice at Hulu
Qianxi Zhang

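1. Hulu is one of the most popular online video sites in the United States, and Hulu Beijing is Hulu's second-largest R&D center. The Beijing big data infrastructure team is responsible for developing and operating the big data infrastructure of the whole company.
2. Overview of HBase at Hulu
3. How HBase is used at Hulu
4. The user profile system stores every user's basic information, user behavior, third-party DMP data and machine learning result tags (several hundred thousand qualifiers). Spark and Spark Streaming read and write the HBase data and run various machine learning models, serving the company's video recommendation, targeted advertising and marketing teams.
5. Optimizations of HBase at Hulu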

Apache HBase at Netease
Xinxin Fan and Hongxiang Jiang

First, we will give a brief introduction to the HBase service at Netease, including basic cluster information and the key HBase services. Then we will share some tips from our HBase tuning practice. Last, we will introduce some improvements in our internal HBase version.

Building online HBase cluster of Zhihu based on Kubernetes
Zhiyong Bai

Zhihu uses HBase, a high-performance and scalable key-value database, to provide an online data store alongside MySQL and Redis. Zhihu's platform team has accumulated experience with container technology, and this time, based on Kubernetes, we built a flexible platform for online HBase. It rapidly creates multiple logically isolated HBase clusters on a shared physical cluster and provides customized service for different business needs. Combined with Consul and a DNS server, we implement highly available access to HBase using a client written mainly in Python. This presentation shares the architecture of the online HBase platform at Zhihu and some practical experience from the production environment.

Contact:

For business and sponsorship inquiries about the conference: Zhang Chong (张冲), zhangchong1@huawei.com

Editor: KOL