2018 Real-Time Big Data Meetup: Technology and Practice of Spark, Flink, and CarbonData

Free registration open

2273 followers

Time: 2018-09-08 09:30 ~ 12:30

Venue: Garage Café (车库咖啡), Haidian District, Beijing

Hosted by Huawei Cloud & Geekbang Technology

Event Details

I. Introduction

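This meetup demystifies the two leading engines for real-time stream processing on big data, Apache Spark (Structured Streaming) and Apache Flink, together with the community newcomer Apache CarbonData and its unified real-time data storage solution.

Traditional enterprises and internet companies alike are placing ever higher demands on real-time big data analysis and processing: the fresher the data, the greater its value. For real-time big data workloads with millisecond-to-second latency, Spark and Flink each have their own strengths. CarbonData is a high-performance big data storage solution that has been deployed in production at more than 20 enterprises, with the largest single cluster reaching a data scale of several trillion. To address the storage redundancy caused by the differing requirements of today's big data analytics scenarios, CarbonData offers a new unified storage solution in which a single copy of the data supports fast filtering and lookup as well as a variety of offline and real-time big data analytics.

The meetup brings together leading speakers from Databricks, Huawei, and Meituan. They have long been active in the Apache open source community, where they hold key technical roles as PMC members and committers, and they aim to demystify the technology and practice of Spark, Flink, and CarbonData from a more open technical perspective.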

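Register now: seats are limited and allocated first come, first served. If you have any questions, please add our assistant. You can also contact our coordinator 小Q君 (ID: geekbang111); when adding, please include the note "meetup".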

II. Speakers

Bio:

Tathagata Das is an Apache Spark committer and a member of the Project Management Committee. He is the lead developer behind Spark Streaming and currently develops Structured Streaming at Databricks. Previously, he was a grad student at the AMPLab at UC Berkeley, where he conducted research on data-center frameworks and networks with Scott Shenker and Ion Stoica.

Abstract: Structured Streaming is a new stream processing engine built on Spark SQL that has changed how developers write stream processing applications. It enables users to express their computations the same way they would express a batch query on static data. Developers can express queries using powerful high-level APIs including DataFrames, Datasets, and SQL. The Spark SQL engine then converts these batch-like transformations into an incremental execution plan that can process streaming data, while automatically handling late, out-of-order data and ensuring end-to-end exactly-once fault-tolerance guarantees.

In this session, Tathagata Das will walk through the basic concepts of Structured Streaming and then work through a concrete example in which, in fewer than 10 lines, you read from Kafka, parse JSON payload data into separate columns, transform it, enrich it by joining with static data, and write it out as a table ready for batch and ad-hoc queries on up-to-the-last-minute data. We will also take a quick look at event-time aggregations, sessionization operations, and other advanced operations.
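The pipeline described in the abstract might look roughly like the following PySpark sketch. This is not the code from the talk: the function name `build_enrichment_query`, the schema fields (`device_id`, `reading`, `event_time`), and all parameters are hypothetical, chosen only for illustration. Reading from Kafka also assumes the Spark Kafka connector package is available to the Spark session.

```python
def build_enrichment_query(spark, bootstrap_servers, topic,
                           static_path, out_path, checkpoint_path):
    """Start a streaming query: Kafka -> parse JSON -> filter -> join -> Parquet.

    All names and the schema below are illustrative assumptions.
    """
    # Imported here so this module still loads where PySpark is not installed.
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import (DoubleType, StringType, StructField,
                                   StructType, TimestampType)

    # Hypothetical schema of the JSON payload carried in each Kafka message.
    schema = StructType([
        StructField("device_id", StringType()),
        StructField("reading", DoubleType()),
        StructField("event_time", TimestampType()),
    ])

    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .load()
        # Kafka values arrive as bytes: cast to string, then parse the JSON.
        .select(from_json(col("value").cast("string"), schema).alias("data"))
        .select("data.*")                     # one column per JSON field
        .where(col("reading").isNotNull())    # a simple transformation
    )

    # Static reference data, read once as an ordinary batch DataFrame.
    devices = spark.read.parquet(static_path)

    # Stream-static inner join enriches each event with device attributes.
    enriched = events.join(devices, "device_id")

    # Write continuously to Parquet files that batch queries can read.
    return (
        enriched.writeStream.format("parquet")
        .option("path", out_path)
        .option("checkpointLocation", checkpoint_path)
        .start()
    )
```

Given an active `SparkSession`, calling this function returns a `StreamingQuery`; calling `awaitTermination()` on it blocks until the query stops. The checkpoint location is what lets the query resume from where it left off after a restart.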

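Bio:

Cai Qiang (蔡强) is a big data architect at Huawei. He has more than 10 years of hands-on experience in big data design and development and has been responsible for several big data projects at petabyte scale.

Abstract:

1. CarbonData usage and internals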

2. What's New in CarbonData?

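Bio:

Sun Mengyao (孙梦瑶) graduated from the University of Science and Technology Beijing and joined the Meituan-Dianping data platform in 2017. She works on building and maintaining the real-time computing platform, with a focus on efficient, reliable, and easy-to-use infrastructure and solutions.

Abstract:

With the rapid growth of Meituan-Dianping's business, meeting the ever-growing demands for data timeliness and for diverse computing scenarios has become a new challenge for the data platform. Against this background, we will introduce how Flink, as a new-generation stream processing engine, is applied in practice at Meituan-Dianping.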

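Bio:

Shi Jinkui (时金魁) is a Scala programmer and a Huawei Cloud technical expert responsible for Huawei Cloud's real-time stream computing service. He previously worked at Sohu and Alibaba and was an early researcher of Spark. He has worked for many years on high-performance computing and big data, and over the past two years has focused on the research and productization of Flink, Spark, and their surrounding ecosystem frameworks.

Abstract:

Stream computing has continued to heat up this year and is widely used in the Internet of Vehicles, IoT, transportation, ETL, e-commerce, ride-hailing, food delivery, and other sectors, creating enormous value. There are many open source stream computing frameworks, with Flink and Spark currently the main ones. The Huawei Cloud real-time stream computing team has focused on stream computing technology for five years, from the in-house StreamSmart to today's CloudStream intelligent stream computing, and has worked through countless pitfalls along the way. This session will cover the following:

1. Comparison of the Flink and Spark streaming frameworks

2. Evolution of Huawei's stream computing technology

3. CloudStream service capabilities and applications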

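III. Agenda

Time	Topic	Speaker
09:00-09:30	Sign-in	/
09:30-10:10	Easy, Scalable, Fault-Tolerant Stream Processing with Structured Streaming in Apache Spark	Tathagata Das / Databricks, Apache Spark PMC
10:10-10:50	Spark + CarbonData: A Unified Real-Time Big Data Solution	Cai Qiang / Huawei big data architect, Apache CarbonData PMC member and committer
10:50-11:00	Short break	/
11:00-11:40	Flink in Practice at Meituan-Dianping	Sun Mengyao / Senior R&D engineer, Meituan-Dianping
11:40-12:20	A Real-Time Stream Computing Service Built on Dual Flink and Spark Engines	Shi Jinkui / Huawei Cloud technical expert
12:20-12:30	Group photo	/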

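Tickets

Ticket name	Price (¥)	Deadline	Quantity
Attendee ticket	Free	2018-09-08 12:30	Sold out
Additional registration (September 5)	Free	2018-09-08 12:30	Sold out
Additional registration (September 7)	Free	2018-09-08 12:30	Sold out

All tickets require approval by the organizer.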

Refund policy: refunds are not supported.

Price: ¥0

This event has ended.
