Amazon Web Services exec on competing with Google and Microsoft

It’s been a busy couple of weeks in the IaaS cloud market, with Google Cloud Platform and Microsoft each holding major user events and announcing significant advancements, which continues to put pressure on the company many consider the market leader: Amazon Web Services.

+MORE AT NETWORK WORLD: Google and Microsoft make their pitch to unseat Amazon in the cloud | Jeff Bezos letter to shareholders: At 10 years old, AWS is bigger than Amazon was and growing faster +

Matt Wood

AWS is gearing up for its own international tour of summits this spring and summer, culminating in its annual re:Invent user conference in December. Network World caught up with AWS General Manager for Product Strategy Matt Wood – who has a PhD in bioinformatics – to discuss the increasing competition in the IaaS market and how customers are using some of AWS’s newest features.

NWW: The big IaaS cloud providers – Amazon, Microsoft, Google and IBM – have been talking a lot recently about machine learning, real-time data processing (like Amazon’s Kinesis) and event-driven computing platforms (like AWS Lambda). Sometimes I wonder, though, if these services are significantly ahead of where customers are in using this technology. Are these technologies the next big thing for the cloud?

Matt Wood: Well, our approach isn’t just to build cool products. It’s much more focused on real, material customer feedback.

I tend to look at it in three buckets. There is the ‘you want to know what happened in the past’ bucket. Customers want to ask increasingly complicated questions about increasingly complicated data that has been aggregated from an increasingly disparate set of sources. For customers who want to do this, we have Amazon Redshift, the canonical data warehouse: you get data into it and then run queries against it. (Event-driven computing platform) Lambda is really good at ETL-style workloads, in which you collect data arriving in real time and then load it into Amazon Redshift. Lambda’s also good for running data warehousing queries. Plus, we have tools like Amazon QuickSight, which let you both visually query that data and share the results of those queries. That’s one bucket: What happened in the past, given all this aggregate information?
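The ETL-then-load pattern Wood describes can be sketched in a few lines. Redshift’s standard bulk-ingestion path is its COPY command, so an ETL Lambda function typically transforms incoming records, stages them in S3, and then issues a COPY over a SQL connection. This is only a sketch of that pattern; the table name, S3 prefix and IAM role below are hypothetical placeholders:

```python
def build_copy_statement(table, s3_path, iam_role):
    """Build a Redshift COPY statement that bulk-loads staged data from S3.

    COPY is Redshift's standard ingestion path, so an ETL function
    typically transforms records, writes them to S3, and then issues
    a statement like this over a SQL connection to the cluster.
    """
    return (
        f"COPY {table} "
        f"FROM '{s3_path}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS JSON 'auto';"
    )

# All three arguments are hypothetical example values.
stmt = build_copy_statement(
    "clickstream_events",                      # hypothetical target table
    "s3://example-bucket/staged/2016-04-20/",  # hypothetical staging prefix
    "arn:aws:iam::123456789012:role/RedshiftCopyRole",
)
```

Executing the resulting statement would be done through any SQL client connected to the cluster; the helper here only assembles the command.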

Then there’s the ‘what’s happening right now’ view. This is all about real-time streaming data analytics. Lambda plays a really key role here: it accepts a stream from Kinesis and responds on the other end. You can just put a real-time data stream into Kinesis and Lambda can handle it.

This is really helpful because Lambda scales, and you only pay for the functions as they run, so it responds to the peaks and ebbs and flows in streaming data, and you only pay for what you use. Lambda can take that information from a real-time stream and process it in a multitude of different ways. This is really important for IoT, where you want to add as much smarts to existing devices as possible. You can’t run around and install a bunch of over-the-air updates to these devices, because they could be in remote, disparate locations or may not have power or a network connection.

But Lambda allows you to continually improve the logic and respond to changes in that data using pretty detailed rules across the content of the messages. Data streams and IoT are making a lot of use of data in real time, and tools like Kinesis and Lambda are great at managing all that data.
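The stream-processing pattern described above can be sketched as a Lambda handler wired to a Kinesis event source. Kinesis delivers record payloads base64-encoded inside the event; the device fields and the temperature rule below are hypothetical stand-ins for the “detailed rules across the content of the messages”:

```python
import base64
import json

def handler(event, context):
    """AWS Lambda entry point for a Kinesis event source.

    Kinesis hands Lambda batches of base64-encoded records. Lambda
    scales with the stream and bills only while the function runs,
    which is why it suits bursty real-time data.
    """
    readings = []
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        reading = json.loads(payload)
        # Hypothetical rule over message content: flag devices
        # reporting outside an allowed temperature range.
        if reading.get("temperature", 0) > 90:
            reading["alert"] = True
        readings.append(reading)
    return readings

# Local smoke test with a synthetic Kinesis event.
if __name__ == "__main__":
    event = {"Records": [{"kinesis": {"data": base64.b64encode(
        json.dumps({"device": "sensor-1", "temperature": 95}).encode()
    ).decode()}}]}
    print(handler(event, None))
    # prints [{'device': 'sensor-1', 'temperature': 95, 'alert': True}]
```

In production the function would be attached to the stream as an event source mapping rather than invoked by hand; the synthetic event simply mirrors the shape Kinesis delivers.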

And then the third bucket is: You want to know ‘what will happen in the future?’ That’s where Amazon Machine Learning plays a role. It allows you to provide very low-latency, high-load, real-time predictions from models you’ve already trained. Lambda can be integrated to execute those predictions through the Machine Learning API, then send that data back to your apps.
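A minimal sketch of that integration, assuming a trained Amazon Machine Learning model: Lambda calls the real-time Predict API and parses the nested Prediction object in the response. The model ID, endpoint URL and feature names below are hypothetical, so the live call is shown only as a comment and the parsing helper is exercised against a sample response:

```python
def extract_prediction(response):
    """Pull the useful fields out of an Amazon Machine Learning
    real-time Predict response. Results are nested under 'Prediction':
    predictedLabel for classifiers, predictedValue for regression."""
    p = response["Prediction"]
    return p.get("predictedLabel"), p.get("predictedScores", {})

# Inside a Lambda function the real-time endpoint would be called
# via boto3 (model ID, endpoint and feature names are hypothetical):
#
#   import boto3
#   ml = boto3.client("machinelearning")
#   response = ml.predict(
#       MLModelId="ml-ExampleModelId",
#       Record={"feature1": "value1"},
#       PredictEndpoint="https://realtime.machinelearning.us-east-1.amazonaws.com",
#   )

# Sample response shaped like the Predict API's output.
sample = {"Prediction": {"predictedLabel": "1",
                         "predictedScores": {"1": 0.87}}}
label, scores = extract_prediction(sample)
```

The handler would then hand `label` (and, if useful, the score) back to the calling application, closing the loop Wood describes.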

Google’s cloud has been making a big deal recently about its machine learning, big data and cognitive computing functionality. The company recently developed a program that beat highly ranked players at the ancient game of Go, for example. What would you say to people who see Google as a more savvy big data and machine learning company compared to Amazon and AWS?

If you look at one of the very early screenshots of the Amazon.com gateway page from around 1996, only books were available, but a million titles of them. Even in 1996, though, you would have seen a feature called ‘Eyes and Editors,’ an early foray into using machine learning to help customers navigate a large and growing catalog of products on Amazon retail. That was very early on, and of course since then we’ve been using machine learning and artificial intelligence across the company.

Everything from recommendations – people who read this also read this; people who bought this also bought this – to helping guide customers through the very broad catalog that we have and using it for fulfillment. We do a lot of work with fraud detection and prevention; we sponsor two professorships in machine learning at the University of Washington. In addition to all of that, we took all of that internal knowledge and technology and exposed it to customers through AWS with Amazon Machine Learning.

I’m pretty comfortable with the credentials we’ve built up in Machine Learning. We apply those relentlessly across the company for the benefit of customers in helping search, identify and discover, and then helping developers apply those exact same algorithms to the data sets they already have on AWS, allowing them to build out both batch and real-time predictions for their applications. I think our credentials are pretty well established.

One of Amazon’s chief competitors, Microsoft, emphasizes its capabilities in hybrid cloud computing as a strategic advantage compared to AWS. Microsoft has plans to offer a product named Azure Stack, which gives customers an infrastructure stack to run in their own data centers that mirrors the Azure public cloud. Correct me if I’m wrong but AWS doesn’t really have something like that. Is the lack of an on-premises AWS cloud holding the company back from a portion of the market that wants something like that?

I wouldn’t say it’s holding us back. We’re a pretty fast-growing business and we’re still growing quickly. We probably have on AWS the largest and most successful collection of real enterprise hybrid applications, and of use cases where hybrid serves as a migration platform for going all in on AWS.

You can look at a company like Johnson & Johnson running 120 apps, which they expect to triple this year, seamlessly integrated across AWS and on-premises; they call it a ‘borderless’ data center where the apps can run between one and the other. We have folks like Comcast who built their new entertainment platform called X1 as a hybrid application that runs on premises and on AWS; it uses the scale of AWS plus their own internal data centers.

Samsung has hybrid applications, which are deployed across their data center plus the cloud; Hitachi is doing integrated resource management across their hybrid cloud of AWS and on-prem. These are not lightweight, un-thoughtful companies; these are large organizations who are running real, core, mission-critical workloads in a hybrid model across AWS and their data centers.

But in terms of an on-premises appliance that would mirror Amazon’s cloud, Amazon doesn’t have anything like that now. Do you ever get requests for that from customers? I know Amazon always talks about doing what customers are asking for.

That’s right. We don’t have anything like that today, but never say never. We’re going to continue to make investments that allow customers to use their on-premises infrastructure alongside AWS. It’s a model that we see both as a way of utilizing existing investment and as an early stepping stone to a much deeper, more thoughtful migration to AWS.

Customers use a combination of Amazon native tools and their on-premises management tools. They use the bridges that we’ve built over the past three or four years, such as identity federation, directory services, and integrated networking with Direct Connect; we have plugins for vCenter and System Center; we have AWS Config and CloudTrail, and even a service like CodeDeploy, which you can run on your own premises. We have things like Storage Gateway, which allows you to build out a gateway internally and then take data from your on-premises environment and back it up to the cloud, either for disaster recovery or for data replication. So there’s a pretty broad set of tools available to customers for building hybrid applications.

When you start looking at migrating data, we have things like a physical storage appliance – our Snowball device – which we will send you so you can load data into it, send it back to us, and we’ll upload it to the cloud. We’re seeing a huge uptick in our database migration tool, which allows you to select an on-premises database, select a new database on AWS, click a couple of buttons, and we’ll manage the migration of that data from places like Oracle and SQL Server onto open source platforms like MySQL, or increasingly into Amazon Aurora. We’ve already seen over 1,000 databases move since we announced the Database Migration Service, and it went into general availability just last month.

Do companies ever outgrow the cloud? Recently, I spoke with officials at Dropbox, who believe that they can run some of their applications more efficiently on a custom-built internal infrastructure stack than on the public cloud. The CTO of Bank of America told me they have found no economic reason to move to the cloud. Especially with open source initiatives like the Open Compute Project, is there a point where it’s more efficient to run some workloads on-premises than in the cloud?

Our stated mission and belief continues to be that in the fullness of time, most companies will not run their own data centers. We’ve seen that to be the case for startups that are going through enormous growth. Folks like Airbnb, Instacart and Tinder start on the cloud but then they remain on the platform as they go through that growth and stabilize. We’ve seen large-scale enterprises who have made the choice to make really key, strategic migrations to AWS.

When you see the CIO of GE – a more than 100-year-old company – saying AWS is their technology platform for hopefully the next 100 years, that’s a good indicator of where large enterprises see things going. And they have a really strategic migration plan to move 9,000 of their internal applications into AWS. We see large, highly regulated entities like Capital One talk about how AWS is actually more secure – they were able to realize higher levels of security than they could on premises.

If you think about ‘private cloud,’ you can choose to make an investment of several million dollars, and then you can maybe have a circa-2013 AWS service available to you. But that only goes so far, because AWS is growing all the time. We’re adding functionality at a pretty strong clip – we’ve already added over 180 features this year – and we’re adding new services, capabilities and instance types. We’re broadening the platform while also deepening the functionality within it. It’s really hard to keep up once you’ve made that big upfront capital investment, and then to keep pace with the developers in your organization who want to move as quickly as they can but find themselves in this little box frozen in time.

By choosing AWS, you get to take advantage of that pace of innovation, you get to take advantage of the scale at which we operate. Time and again we see customers choosing AWS over on-premises infrastructure and even if sometimes customers experiment with their own infrastructure, they very frequently come back to AWS in the long-term.

But are there some applications and organizations that have the scale and technical in-house expertise that could build a customized infrastructure that is more precisely tuned for the needs of their application?

I disagree with the premise. For the idea that there is some highly specialized workload out there, we have highly specialized tooling to support workloads of that type. Whether you want to run high-performance computing, petabyte-scale analytics, or real-time streaming over measurements coming from disparate locations thousands of times per second, we have a platform for that. I can’t think of a broader set of workloads than the ones that fit inside our infrastructure technology platform. And that’s by design: we’ve built the platform so that it can accommodate a very broad set, and that includes highly optimized, highly specialized workloads.

IDG News Service

