AutomatedBuildings.com Interview

John Petze is a partner and Co-Founder at SkyFoundry, the developers of SkySpark™, an analytics platform for building, energy and equipment data. John has over 35 years of experience in building automation, energy management, and M2M, having served in senior level positions for manufacturers of hardware and software products including President & CEO of Tridium, VP Product Development for Andover Controls, and Global Director of Sales for Cisco Systems Smart and Connected Buildings group. He is also a member of the Association of Energy Engineers. At SkyFoundry he helps facility operators take advantage of advanced operational analytics to create truly intelligent and efficient buildings.

Deploying Data Analytics at the Edge – Continuing our EDGE-ucation series

Sinclair: One topic that has generated a lot of interest from our readers is the “Edge.” It appears that “the edge” is a term that can mean different things. Can you offer your thoughts on how Edge Computing applies to data-oriented applications such as analytics?

Petze: It’s true that the edge is becoming one of those hyped words, but the concepts are very simple, tangible and important to our industry. When we talk about the “edge” or more correctly “computing at the edge,” we are referring to performing essential data acquisition and computation functions as close to the data source as possible. One example is advanced analytics running on small IoT devices mounted directly on equipment systems or embedded within the equipment controller.

The concept of edge computing is especially relevant in relation to the focus on the cloud that our industry has seen over the past 5-6 years. Cloud computing provides many benefits but isn’t a panacea for all of the requirements encountered in applying IoT technologies to the built environment. The first generation of analytics applications for data produced by sensors, equipment systems, and IoT devices focused on “computing in the cloud.” This meant that the software applications were based on a requirement to transmit ALL data from equipment systems up to the cloud (or another centralized server) where analytics and the associated generation of visualizations, reports, and notifications would then be performed.

This approach may have been a natural way to start, but it is not viable for the realities of the IoT and control in the built environment. The full benefits of data-oriented applications can only be achieved with an architecture that provides for computing at the edge. By this, we mean solutions that provide data acquisition, storage, analytics and the generation of visualizations at the edge, without any dependence on the cloud.

Sinclair: Can you elaborate on some of the specific challenges that “computing at the edge” addresses?

Petze: I think a good place to start is by thinking about the IoT in general. The IoT is actually a distributed computing challenge. The reality is that it is not possible, cost-effective or desirable to transmit every piece of data from every IoT device to the cloud in order to gain value from that data.

The world we experience every day is a distributed computing world. Think about it for a moment…

Bring up your browser. On your PC, your phone or your tablet…
Go to Yahoo or Google or your favorite site. Look up a subject of interest. Do a search. Boom! There it is. The information you wanted.

How did that happen? Did you upload and store all of the information to your computer first? To your cell phone? To your tablet? No.

Is all of the information aggregated and stored on a single server or somewhere in the cloud? Did someone have to assemble and store it ahead of time in order for it to be searchable, accessible, viewable? Obviously, the answer is no.

You request what you want when you want it. You search for what you want when you need it. You subscribe to news feeds that interest you. But you don’t try to aggregate it all in one place. Because you can’t. And, there is no need to.

Nothing on the web works that way. Search doesn’t work that way. When you type in a search, that request is dispatched to hundreds or thousands of computers. They all respond and then their results are shown as if they came from a single server. That is accomplished via a technique known as “map-reduce.”
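The scatter-and-combine idea described above can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the three in-memory lists in `SHARDS` stand in for separate computers, and the names `map_shard`, `reduce_results`, and `search` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each list stands in for one machine holding part of the data.
SHARDS = [
    ["chiller plant efficiency", "edge computing basics"],
    ["edge analytics for buildings", "cloud cost overview"],
    ["self-driving car latency", "edge gateway security"],
]

def map_shard(shard, term):
    """Map step: each shard searches only its own local documents."""
    return [doc for doc in shard if term in doc]

def reduce_results(partials):
    """Reduce step: merge the per-shard hits into one sorted result list."""
    merged = []
    for hits in partials:
        merged.extend(hits)
    return sorted(merged)

def search(term):
    # Dispatch the same query to every shard in parallel, then combine the answers.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda shard: map_shard(shard, term), SHARDS))
    return reduce_results(partials)

print(search("edge"))
```

The caller sees a single list of results, even though no shard ever held all of the data.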

Yet most first-generation IoT data applications required all data to be sent to the cloud (or another central server) to be aggregated in order to be able to perform analytics and visualization. The reality is that you cannot bring every piece of data from hundreds, thousands, millions or billions of devices to a single server in order to be able to use that data, visualize it, analyze it, present it, and gain value from it. As the industry moves to more and more deployment of IoT devices and use of data-oriented applications this limitation has become very clear.

Consider the example of a self-driving car. We can’t be dependent on sending data to the cloud before deciding to activate the brakes. That data analytics process needs to occur in the vehicle – at the edge. Yet other applications are better served by aggregating data on a central server. Consider how mapping applications collect and analyze GPS data from mobile phones to identify traffic jams and direct us to the best route to our destination. That application is better served by the cloud.

Data analytics solutions need to embrace the highly distributed nature of the IoT and support that with a corresponding software architecture that enables computing to occur where it is most efficient, cost-effective, and reliable. That means an edge-to-cloud software architecture.

The self-driving car example highlights the “data latency” reasons we need to perform data analytics at the edge, but there are others.

Data Reliability – data collection close to the end devices, whether a sensor or a controller, increases reliability versus having to send that data over the Internet to get it to the cloud. Networks do experience outages. Having an edge device that can store data for a short period of time provides a limited solution to that problem. While the data may be stored for transport to the cloud when the network is restored, no actual value (analytics, visualization, control decisions) is generated from that data while the network is down. With a distributed architecture that supports computing at the edge, those processes continue. That means in-building personnel have access to the full capabilities of their systems even if the connection to the cloud is unavailable.
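The difference between simply buffering data and actually computing at the edge can be shown with a minimal sketch. Everything here is an assumption made for illustration: the `EdgeNode` class, its `high_limit` alarm rule, and the `uploaded` list that stands in for the cloud are hypothetical and do not represent any particular product.

```python
from collections import deque
from statistics import mean

class EdgeNode:
    """Hypothetical edge node: analyzes locally, forwards results when it can."""

    def __init__(self, high_limit, buffer_size=1000):
        self.high_limit = high_limit
        self.readings = []                       # local history, always available on site
        self.outbox = deque(maxlen=buffer_size)  # results waiting for the uplink
        self.uploaded = []                       # stand-in for the cloud side

    def ingest(self, value, cloud_online):
        # Local storage and analytics run regardless of the uplink state.
        self.readings.append(value)
        result = {"value": value, "alarm": value > self.high_limit}
        self.outbox.append(result)
        if cloud_online:
            self.flush()
        return result

    def flush(self):
        # Forward everything queued while the link was down, oldest first.
        while self.outbox:
            self.uploaded.append(self.outbox.popleft())

    def local_average(self):
        return mean(self.readings)

node = EdgeNode(high_limit=75.0)
node.ingest(70.0, cloud_online=True)
during_outage = node.ingest(80.0, cloud_online=False)  # alarm is still raised locally
node.ingest(72.0, cloud_online=False)
node.ingest(71.0, cloud_online=True)                   # link restored, backlog is sent
print(during_outage["alarm"], len(node.uploaded), node.local_average())
```

In this sketch the alarm and the running average remain available to on-site staff during the outage, and the queued results reach the cloud once the link returns.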

Isolation of Fieldbus Networks – in many cases sensing and control devices communicate via networks that are not designed to go over the Internet or cannot do so in a way compatible with modern IT security requirements, for example, serial networks like RS485 and RS232, local wireless networks, and others. This means that some type of gateway node needs to be installed to isolate those networks and act as a data translator/forwarder. By supporting true computing at the edge, that node can do more than act as a gateway. It can perform the full stack of functions for data analytics, presentation, and control at a similar cost.

Data Transfer Costs and Performance on Constrained Networks – This is one of the hidden costs of centralized cloud solutions. Data transfer to the cloud is not free. You typically see two areas of cost. First, most cloud platforms have charges related to data transfer – they charge based on the amount of data you send to the cloud. Equally important are the costs associated with transmission of data over cellular networks.

Increasingly, IoT devices are being connected via cellular networks. Sometimes this is done to avoid the challenges of integrating with corporate networks and IT security requirements. In other cases, it is done because no hard-wired network is available, for example, remote monitoring sites, agricultural applications and the like. The costs associated with transferring high volumes of data over cellular networks can significantly impact the economics of a project.

The capabilities of “computing at the edge” change that equation. By computing at the edge, the data is collected locally and analytics are performed locally. Only the results go across the cellular network. This can reduce network data usage by a factor of 100 to 1 or even 1000 to 1.
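A back-of-the-envelope sketch shows where reductions of that order can come from. The workload below (one reading per second for an hour, summarized as four values) is an assumption chosen purely for illustration, and the `summarize` function is hypothetical; real ratios depend on the sampling rate and on what results are sent.

```python
from statistics import mean

def summarize(samples):
    """Reduce a window of raw readings to a few result values."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": mean(samples),
        "count": len(samples),
    }

# Assumed workload: one reading per second for one hour from a single sensor.
raw = [20.0 + (i % 10) * 0.1 for i in range(3600)]
summary = summarize(raw)

# Compare how many values would cross the network in each approach.
reduction = len(raw) / len(summary)
print(f"{len(raw)} raw values -> {len(summary)} result values ({reduction:.0f}:1)")
```

Under these assumptions, 3600 raw values shrink to 4 result values, a ratio of 900 to 1.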

Application Reliability and Process Continuity – We have spoken about the reliability of data collection, but there is another aspect of reliability. Consider a remote site with local users of the analytics results. Perhaps they depend on those results to optimize a chiller plant or a production process. With an edge-computing solution, they still have access to their data and analytic results even if the connection to the cloud or central server is lost. In many mission-critical applications this is essential.

Sinclair: So does this mean the Cloud is dead?

Petze: While the examples we have just reviewed present some of the very real factors that are driving analytics and similar data-oriented applications to the edge, it’s worth mentioning that processing analytics at the edge does not mean completely abandoning cloud or central servers – they have their place and will continue to do so. It would be a mistake to think of this as one or the other. What computing at the edge means is that the power and benefits of data analytics can be brought to the place in the architecture where they can most effectively deliver value. The Cloud is not dead, but the Edge is now definitely alive.

Contact Information:
John D Petze
Principal, SkyFoundry
john@skyfoundry.com
804-545-3116

More information on SkySpark® analytics is available at www.skyfoundry.com
