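[51CTO Exclusive Interview] At Microsoft there is a group of people known as "Technical Fellows." This is Microsoft's internal title for a select group of technical leaders. A Technical Fellow's foresight and expertise in a particular technology, together with their standing as an industry leader, is regarded as equivalent to a Microsoft vice president's leadership on the business side.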
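A few days ago, a 51CTO editor had the opportunity to meet one such Microsoft Technical Fellow and conduct a brief interview with him. David Campbell joined Microsoft in 1994, just as the company was moving into the enterprise software market, and his main focus at the time was the database and storage business. Starting in 2008, Dave shifted his focus to Azure, Microsoft's cloud platform, and big data strategy, concentrating on areas such as cloud-scale computing and realizing the value of ambient data.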
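Dave came to Shanghai as a member of the strategic advisory committee for the STB China team of the Microsoft Asia-Pacific R&D Group, to learn about the group's work over the past year and to share Microsoft's views on overall technology trends.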
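"At Microsoft, what we emphasize in big data is the fourth V: value. Whether you can obtain the data is not the most important thing. What matters is that you can derive value from it, and that is the most fascinating part."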
So how exactly does Dave define the value of big data? The interview transcript follows.
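51CTO: Hello Dave, thank you for taking our interview! You just mentioned the fourth V of big data, value. How do you define that value for your customers in different industries?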
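Dave: Big data has enormous potential. One interesting thing is that when many people talk about big data, they think it only relates to the Internet. In fact, financial services, oil exploration, the optimization of transactions in e-commerce, life sciences, pharmaceutical research, and many other fields are all involved. In pharmaceuticals, more than a decade of drug trial data has been accumulated, along with all kinds of data on how chemical compounds react with one another. In oil and natural resources, we have done a lot of work on optimizing exploration. It really is a great many fields. Another very important aspect is how we improve people's productivity inside the enterprise. We talk about the social graph, the social graph within a company. Are other people in my kind of role paying attention to the same information I am, or are they looking at something else? There are many opportunities in many industries.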
51CTO: But when we talk about traditional businesses, don't you need sensors to collect the data?
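Dave: Not necessarily. Take the social graph inside a company: the information we have includes the company's organizational structure, the emails employees send one another, messages sent over other communication tools, and so on. Which people are looking at the same report in the BI system? With this information you can arrive at many interesting insights.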
51CTO: All of that is born-digital data.
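Dave: Yes. Another aspect of the digital world is the digital exhaust from a company's existing business. This data is now easy to collect: just by observing the data exchanges, you can see who is communicating with whom. There is a lot that can be done.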
51CTO: How does Microsoft help customers find the value that big data holds for them?
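Dave: The current state of big data is still a bit like a group of PhDs tinkering with a pile of software and doing a lot of mathematical proofs. In the end, though, there are many people who need to make all kinds of decisions every day. Can we help them make better decisions? There is a saying, "from terabyte to insight." Even if you do your data work in Excel, what we want to do is use this data to help us make better decisions.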
51CTO: Have you seen these companies run into constraints when using their data?
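Dave: Personally, I think the biggest constraint is that they need people to "unlearn" things before they can see the new value of big data. For example, many companies have built and deployed data warehouses, and they may think, this is my big data. But that is only one version of the truth. I think that in this new era there are many versions of the truth. The data in the data warehouse is one version, and it is still valuable. But there is another version of the truth: how people on social networks are talking about your product. That version is also very important. And there is yet another version, the one I mentioned earlier, the company's digital exhaust. So there are many opportunities, and the biggest constraint is whether people can step back and see the various versions of the truth.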
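51CTO: Recently we have seen a new trend called the "data store." It is a bit like the industry reports of the past, except that what we get now is a real-time data service. What do you think of this kind of service? How has the way people consume data changed?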
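Dave: At the personal level, 20 years ago people who followed finance, say stocks or futures, got their information from newspapers. Today financial information is all obtained over the web. A few years ago I was talking with people from a large airline, and they mentioned wanting to optimize fuel usage, or to decide whether to move planes from an airport to a safe place before a storm hits. If you think about it, where would they get that information? Where would they get the kind of weather forecasting model that lets them decide whether to move the planes? This really comes down to moving and analyzing data. Data services will become very, very important. Beyond the large-scale services I just described, I think services inside the enterprise are even more important. At Microsoft we have talked about "self-service BI" for many years: give people tools and let them derive their own insights and make decisions. That is really only part of it. Now people ask for more sources of information, not only data inside the company but also data from outside it, to help them form good insights into their own part of the business.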
51CTO: So where does this data come from?
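Dave: That is the challenge we face now. Part of our work is to make this data accessible and searchable. With the internal social graph I mentioned earlier, an important question is: what information sources are other people in my position using? It is a bit like a recommendation engine for data services. That makes information easier to find and easier to collect.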
The English transcript of the interview follows.
51CTO: Hi Dave, thanks for taking our interview! Our first question is, how do you define big data's value to your customers in various industries?
Dave: There is a lot of potential in big data. One of the interesting things we are finding is that many people think big data is only applicable to Internet-scale services. But we are seeing a lot of industries getting value from it: financial services, oil exploration, digital commerce and the optimization of commerce transactions. Another area is life sciences and pharmaceuticals, where in many of the cases we've worked on, people are looking over decades of clinical trial data to try to find interactions between compounds we already have that could be put to use. And finally, oil and gas and natural resources, with optimization of exploration as well. Really a variety of them. One of the most interesting ones, I think, is how we improve the productivity of people working in businesses. There is a lot of talk around the social graph, including interesting social graphs within businesses. Do other people like me in similar roles consume the same information? There are a lot of opportunities in a lot of industries.
51CTO: So when you talk about traditional industries, don't you need sensors to collect data?
Dave: Not necessarily. Think about the social graph inside a business. There is a directory inside the business, so we know how people are related from the organizational structure. Then think about the emails people send to one another, or other forms of communication. There is really a variety of things. If you look at which people look at the same report in the BI system, you can come up with some very interesting insights.
51CTO: So that's the data that's born to be digital.
Dave: Yes. The other part of the digital world is the digital exhaust from the existing business. It's quite easy to collect that now. You can just look at who is interacting with whom, based on the digital exchanges. There is a lot that we can do.
51CTO: How is Microsoft helping customers in realizing the potential of big data, especially for different industries?
Dave: One of the things about big data thus far is that it has been about PhDs working with software they piece together, doing a lot of algebra and proofs. But at the end of the day, there are lots of people who are making lots of decisions. Can we help them make better decisions? Sometimes we talk about this as going from terabytes to insights. We find that a lot of people are just working in tools like Microsoft Excel, so how do we get the data all the way from sitting in the big data cluster to someone looking at something and making a better decision?
51CTO: So we need analysis tools to do that.
Dave: Yes.
51CTO: Do you see any constraints those companies are facing? They want to get value from their data, but are they running into constraints?
Dave: I think honestly one of the biggest constraints is that they need to have people unlearn things to be able to see big data's new value. I'll give you one example. There are a lot of people who have deployed large-scale data warehouses. And they think, that is my big data. That is one version of the truth. I think that in this new world, there are really multiple versions of the truth. The version of the truth you have in your data warehouse is still valuable. But there is also a version of the truth about what people are saying in the social space about your product. If you are not paying attention to what the social space is saying on the web, you are missing a version of the truth that is also important. There is also another version of the truth that can be derived from what I mentioned before, the digital exhaust from the existing business. There is really a lot of opportunity, and one of the constraints is getting people to step back and see that there is great potential in this.
51CTO: Cool. So recently there is a new thing coming up, called the Data Store. It is something like reports, but from live data services. What do you think about this kind of service? And how has the way people consume data changed?
Dave: On a personal level, if you look at people who were tracking finances, stocks or equities 20 years ago, they would look at the newspaper. Today there are a lot of web services for getting financial data. A couple of years ago, I was talking to a large airline. They wanted to be able to optimize fuel usage, or, if there was a storm coming in, to decide whether they should move their planes from the airport to safe areas. And if you think about where they would get this information, where they are going to get the weather model to tell them whether they should move their planes or not, all of this is just about moving data and being able to transform and interpret it. Data services will become a very, very important thing. And that is services at large scale. I think it matters even more within a business. We have been talking about it for a few years; Microsoft calls it self-service BI. Let people have the tools to derive their own insights and make decisions. That is all very well, but now what people ask is where the pieces of information are, whether it's data services within the business or outside it, that allow them to pull things together and get insights into their own part of the business.
51CTO: How do you find this data now?
Dave: Today it is a challenge. Some of the technologies we are working on are meant simply to make the data available and searchable. We spoke earlier about the internal social graph. Imagine asking: other people in my role, what information sources do they use? It's like a recommendation engine for data services. It makes information easier to find and easier to collect.