A simple question: where will all of this data get stored?
Hortonworks, Cloudera Players in Future of Self-Driving Cars, Says Cowen
Hortonworks, Cloudera and Teradata are among names that could benefit from an "explosion" of data in connected vehicles in coming years, say analysts with Cowen & Co. in a 242-page think-piece on autonomous vehicles.
By Tiernan Ray
Sept. 25, 2017 11:18 a.m. ET
Cowen's analysts project the "explosion" in data from connected cars.
Cowen & Co.’s analysts today published a mammoth, 242-page group report on the future of electric vehicles, one of whose takeaways is that there will be lots of work for software makers such as Hortonworks (HDP) and Cloudera (CLDR), purveyors of the open-source Hadoop data mining technology.
After that, however, artificial intelligence becomes the next goal, and the group sees a field with lots of competing approaches and apparently no clear winner.
The authors, 19-strong, start with the premise that purchases of electric vehicles will be stronger in the next few years than the market currently expects, albeit still a fraction of annual auto sales:
We see an inflection of electric vehicle (EV) adoption in the 2025 to 2030 time frame as vehicles become economical on an unsubsidized basis. We see the start of the “hockey stick” in demand starting in 2018 though due to electric cars being more widely available, having a “cool” factor more broadly than just the Tesla does today, as well as economic ownership as EVs becoming cheaper than internal combustion engines (ICEs). We see global EV penetration hitting 1% in 2017 and rising to 3.1% in '20 and 7.5% in '25. Most other forecasts call for about 2.5% penetration in '20 and 5% in '25; however, estimates have been creeping up. While we expect a sharp acceleration of growth in EVs in the coming years, we note that our '20 forecast still has 98% of vehicles sold using some form of an internal combustion engine, which includes hybrid solutions as well as plug-in hybrid electric vehicles (PHEVs).
Software analyst J. Derrick Wood offers his thoughts on A.I., with the premise that the “hundreds” of sensors and other chips in each vehicle will cause data “volumes to explode.”
“This is already true,” writes Wood, “with Formula 1 cars which create 36TB of data per race, generated from >100 sensors that are distributed throughout the car, collecting data about the braking system, tire pressure, engine, calibration, temperature and much more.” He cites one research house, Datameer, stating that a self-driving car will generate 1 terabyte of data per hour.
Wood offers a forecast for the annual volume of data, which shows it reaching 1.5 million petabytes of data annually by 2020, as shown in the chart at the top of this post.
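To put the forecast in proportion with the per-car figure, a rough back-of-the-envelope calculation helps. The sketch below is not from the Cowen report: it assumes decimal units (1 petabyte = 1,000 terabytes), takes the Datameer figure of 1 terabyte per hour at face value, and uses a hypothetical helper name, `car_hours_for_volume`, to express the annual forecast as an equivalent number of self-driving "car-hours."

```python
# Back-of-the-envelope sketch; the unit convention and the helper name are assumptions.
TB_PER_PB = 1_000  # assumption: decimal units, 1 PB = 1,000 TB


def car_hours_for_volume(total_pb: float, tb_per_car_hour: float = 1.0) -> float:
    """Return how many hours of driving would produce `total_pb` petabytes,
    given a per-car data rate in terabytes per hour."""
    return total_pb * TB_PER_PB / tb_per_car_hour


# Figures quoted in the article: 1.5 million PB per year, 1 TB per car per hour.
forecast_pb = 1_500_000
hours = car_hours_for_volume(forecast_pb)
print(f"{hours:,.0f} car-hours")  # prints "1,500,000,000 car-hours"
```

On these assumptions, the 2020 forecast corresponds to about 1.5 billion hours of driving at the quoted self-driving data rate, though the forecast itself covers connected cars generally, most of which would produce far less data per hour.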
All that data growth will lead to car makers becoming “large consumers of data/analytics, and even applications software.”
Among the software that will gain popularity are Apache Spark and Apache Kafka, two tools for “stream processing” to handle the flood of data coming from the sensors, writes Wood.
The next stage, he writes, is batch processing of all the data that is stored:
Vendors serving these kinds of applications span the data management stack, including data integration platforms (MuleSoft, Talend, Alteryx), Hadoop platforms (Cloudera, Hortonworks, MapR, Databricks), NoSQL platforms (MongoDB, DataStax, Couchbase, MarkLogic), SQL relational platforms (Oracle, SAP, IBM, Microsoft, Teradata), Cloud platforms (AWS, Azure, GCP) and BI tools (Tableau, Power BI, Domo).
And that will lead to Hortonworks and Cloudera, along with privately held MapR, and Teradata (TDC) becoming tools vendors to automakers for the latter to offer “new application and content services.”
"These service providers will want to analyze usage behavior leveraging high-frequency feedback data to optimize their content & services and extend their omni-channel initiatives."
Already, notes Wood, four of the top five automakers are using Hortonworks’s software. He describes some of the capabilities:
The Hortonworks Data Flow (HDF) platform supports bi-directional data communication between an on-vehicle platform and the cloud (known as its data-in-motion platform). It communicates sensor and telematics control unit data in real-time such as speed, geo-location and airbag deployment. HDF has an intelligent agent that runs on embedded devices in the car, and it uses data filtering and prioritization to determine the most crucial data sets to communicate between the car and the cloud. The Hortonworks Data Platform (HDP) manages the data-at-rest for storage, security, operations, analytics and machine learning.
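The "data filtering and prioritization" step Wood describes can be pictured with a small sketch. This is not Hortonworks's agent or its API; it is a generic illustration that assumes a hypothetical priority table (`PRIORITY`), a hypothetical per-cycle upload budget, and made-up sample readings. Unknown signals are filtered out, and the rest are sent most-urgent-first.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical priority table (lower number = more urgent); not from the report.
PRIORITY = {"airbag_deployment": 0, "geo_location": 1, "speed": 2}


@dataclass(order=True)
class Reading:
    priority: int
    seq: int  # arrival order, used to break ties between equal priorities
    signal: str = field(compare=False)
    value: float = field(compare=False)


def select_for_upload(readings, budget):
    """Filter out signals with no assigned priority, then return up to
    `budget` readings, most urgent first."""
    queue = []
    for seq, (signal, value) in enumerate(readings):
        if signal not in PRIORITY:
            continue  # filtering: this signal is never sent to the cloud
        heapq.heappush(queue, Reading(PRIORITY[signal], seq, signal, value))
    # prioritization: pop the most urgent readings until the budget is used up
    return [heapq.heappop(queue) for _ in range(min(budget, len(queue)))]


# Made-up sample data for illustration only.
sample = [
    ("speed", 88.0),
    ("cabin_temp", 21.5),
    ("airbag_deployment", 1.0),
    ("geo_location", 37.77),
    ("speed", 90.0),
]
selected = select_for_upload(sample, budget=3)
print([r.signal for r in selected])
# prints ['airbag_deployment', 'geo_location', 'speed']
```

A priority queue is only one way to model this; a real on-vehicle agent would also have to deal with bandwidth limits, buffering while offline, and security, none of which is shown here.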
The “key to bringing it all together,” writes Wood, is A.I. Wood then goes into a lengthy explanation of A.I. techniques, such as “machine learning.”
We have seen major technology initiatives from Uber, Lyft, Google, Ford, GM, BMW, Audi, Nissan, Volvo, Daimler, Renault, Toyota, Tesla, Navistar and many others. In fact, GM, Volvo, Nissan, Ford and Tesla all have plans to have achieved full autonomy into its vehicles at some point over the next several years, including Nissan in 2020 and Ford in 2021. Google created a new company, Waymo, for focusing on driverless vehicle technology.
Wood notes that automakers have their own initiatives for A.I. and autonomous driving, and that […] Companies like Google (Tensorflow), Microsoft (CNTK) and AWS (DSSTNE) have open-sourced their Machine Learning IP and now ML algorithms are widely available to be used to build an AI platform. In turn, there are a plethora of start-ups building their own proprietary algorithms and services to evangelize the growing market opportunity. The automotive industry has widely embraced algorithms to analyze the vast amounts of data streaming from test vehicles, many of which are capturing 1000s of GB of data per hour from cameras and other sensors throughout the car.