In our previous post, we discussed how every IoT system has a large number of stakeholders. Each seeks a unique set of business and technical outcomes, all of which must be discovered and balanced to create a commercially viable IoT system.
As the owner of the IoT development effort, you will be collaborating with these stakeholders across many departments. A key ingredient for meeting their disparate needs is what we call the IoT data pipeline platform. This article explores reasons why you should include an IoT data pipeline as a common platform component across your company’s IoT initiatives.
Reduce costs and risks
Why build an IoT data pipeline platform? An IoT data pipeline platform encapsulates the infrastructure and software required to ingest and persist IoT data. It then makes this data accessible to a variety of consumers, each with its own organizational purpose.
Too often, companies start their IoT journey by building multiple, siloed IoT applications. This causes many problems, which we shall discuss. Taking a platform-based approach means that efforts and costs are centralized and shared. Furthermore, a platform-based approach increases consistency and reliability across your enterprise IoT initiatives. This substantially reduces risks and costs by eliminating duplicate efforts, especially around the integration and persistence of data.
Define integration patterns
The inherent fluidity of stakeholder needs will put pressure on the IoT data pipeline platform to be flexible and rapidly adaptable. A key area requiring flexibility is integrating with different external systems and internal business applications such as CRMs and ERPs. Such integrations are paramount for bringing context to raw IoT data, a requirement for producing useful business outcomes. This includes data external to the organization such as price and weather, as well as internal enterprise data including sales numbers and product details.
When IoT applications are built in silos, each team or division must shoulder the cost of developing similar integrations. The same endpoint integration gets rebuilt in project after project, each time using whatever method the developer on that team found most attractive. This adds risk to each project, wastes valuable resources, and eventually becomes a maintenance nightmare.
Embrace reuse and standardization
In contrast, a platform approach shares the cost by creating reusable components and enforcing standardization. Having the platform define the enterprise integration patterns exposes data more uniformly through industry standards and better supports reporting and analytics applications, both now and in the future.
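As a rough illustration of what a reusable integration component might look like, the sketch below defines one shared connector interface that every enterprise source implements, plus a single enrichment function that adds context to a raw reading. All names here (`ContextConnector`, `StaticConnector`, `enrich`, and the sample fields) are hypothetical, and the in-memory tables stand in for real CRM or ERP calls.

```python
from abc import ABC, abstractmethod


class ContextConnector(ABC):
    """Hypothetical shared interface that every integration implements."""

    name: str

    @abstractmethod
    def lookup(self, device_id: str) -> dict:
        """Return context for a device, or an empty dict if none is known."""


class StaticConnector(ContextConnector):
    """Stand-in for a real CRM/ERP connector, backed by an in-memory table."""

    def __init__(self, name: str, table: dict):
        self.name = name
        self._table = table

    def lookup(self, device_id: str) -> dict:
        # Copy so callers cannot mutate the connector's own data.
        return dict(self._table.get(device_id, {}))


def enrich(reading: dict, connectors: list) -> dict:
    """Attach context from each connector under that connector's name."""
    enriched = dict(reading)
    for connector in connectors:
        enriched[connector.name] = connector.lookup(reading["device_id"])
    return enriched


# Assumed sample data, for illustration only.
crm = StaticConnector("crm", {"pump-7": {"customer": "Acme"}})
erp = StaticConnector("erp", {"pump-7": {"product": "P-100"}})
result = enrich({"device_id": "pump-7", "value": 21.5}, [crm, erp])
```

Because each source sits behind the same interface, a new integration is written once on the platform and then reused by every application that needs it.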
Evaluate data sources and storage options
Key design questions to keep in mind when evaluating the data sources include:
- Which data needs to be persisted and in what formats?
- What data is already accessible?
- What metadata is needed to contextualize this data?
- What connectors need to be created for currently inaccessible data?
The choice of data store depends on how structured the ingested data is, along with its volume, throughput, and accessibility requirements (hot versus cold storage), and on how maintainable and supportable the store is. Cloud service providers offer many options for data stores, such as file (or blob), RDBMS, NoSQL, time series, and graph. However, there’s no silver bullet. IoT data pipeline platforms must access data from multiple data sources to deliver outcomes regardless of the storage method utilized.
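To make the hot/cold accessibility trade-off concrete, here is a minimal sketch that picks a storage tier from the age of a reading. The 30-day window and the `storage_tier` function are assumptions chosen for illustration; a real threshold would come from your own query patterns and cost targets.

```python
from datetime import datetime, timedelta, timezone

# Assumed cutoff: readings newer than this stay in fast ("hot") storage.
HOT_WINDOW = timedelta(days=30)


def storage_tier(recorded_at: datetime, now: datetime) -> str:
    """Return "hot" for recent readings and "cold" for older ones."""
    return "hot" if now - recorded_at <= HOT_WINDOW else "cold"


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
recent = storage_tier(datetime(2024, 1, 15, tzinfo=timezone.utc), now)
old = storage_tier(datetime(2023, 11, 1, tzinfo=timezone.utc), now)
```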
Ingest IoT data
Another dimension of the IoT data pipeline platform that deserves attention is how to ingest IoT data in the first place. New IoT devices may be able to connect directly to cloud endpoints, like Azure IoT Hub or AWS IoT Core. However, many fleets also contain IoT devices that cannot connect on their own, requiring an edge gateway to proxy the connection to the cloud service. In many cases, existing equipment cannot be updated at all and can send data only via a hardcoded HTTP POST. Collecting this data must still be accounted for in your IoT solution architecture and integrated into the overall cloud application.
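One way to handle these different ingestion paths is to normalize every incoming payload into a single envelope before it enters the rest of the pipeline. The sketch below assumes two hypothetical payload shapes: a JSON message from a device or gateway, and a form-encoded body from legacy equipment using a hardcoded HTTP POST. The field names (`deviceId`, `id`, `m`, `v`) are invented for illustration and do not reflect any vendor's message format.

```python
import json
from urllib.parse import parse_qs


def from_json_message(body: bytes) -> dict:
    """Normalize a JSON payload sent by a device or edge gateway."""
    msg = json.loads(body)
    return {
        "device_id": msg["deviceId"],
        "metric": msg["metric"],
        "value": float(msg["value"]),
        "source": "json",
    }


def from_legacy_post(body: bytes) -> dict:
    """Normalize a form-encoded body from a legacy HTTP POST."""
    fields = parse_qs(body.decode("utf-8"))
    return {
        "device_id": fields["id"][0],
        "metric": fields["m"][0],
        "value": float(fields["v"][0]),
        "source": "legacy_http",
    }


# Map each ingestion path to its parser (hypothetical path names).
PARSERS = {"json": from_json_message, "legacy_http": from_legacy_post}


def normalize(path: str, body: bytes) -> dict:
    """Dispatch to the parser for the given ingestion path."""
    return PARSERS[path](body)


legacy = normalize("legacy_http", b"id=pump-7&m=temp&v=71.5")
modern = normalize("json", b'{"deviceId": "pump-9", "metric": "temp", "value": 22}')
```

Downstream consumers then see one record shape regardless of how the data arrived.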
Clean and transform incoming data
Frequently, ingested data must be cleaned, transformed, and organized before it becomes useful for analytics and driving business goals. For example, temperature or other sensor readings may need to be transformed to use standard units. Data may be dirty or contain missing values which could cause downstream issues with automated processes like notifications or reports. These issues need to be addressed before the data can be safely combined with data from enterprise sources and used for financial purposes like billing and ordering, along with ongoing operational decisions.
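As a minimal sketch of this step, the code below converts temperature readings to a standard unit and sets aside records with missing values or unknown units instead of passing them downstream. The `Reading` type, the choice of Celsius as the standard, and the two-decimal rounding are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reading:
    device_id: str
    value: Optional[float]
    unit: str  # expected: "C", "F", or "K"


def to_celsius(value: float, unit: str) -> float:
    """Convert a temperature to Celsius; raise on an unknown unit."""
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0
    if unit == "K":
        return value - 273.15
    raise ValueError(f"unsupported unit: {unit!r}")


def clean(readings):
    """Split readings into (cleaned, rejected) lists."""
    cleaned, rejected = [], []
    for r in readings:
        # Missing or NaN values are rejected rather than guessed.
        if r.value is None or math.isnan(r.value):
            rejected.append(r)
            continue
        try:
            celsius = to_celsius(r.value, r.unit)
        except ValueError:
            rejected.append(r)
            continue
        cleaned.append(Reading(r.device_id, round(celsius, 2), "C"))
    return cleaned, rejected


cleaned, rejected = clean([
    Reading("a", 212.0, "F"),
    Reading("b", None, "C"),
    Reading("c", 300.0, "K"),
    Reading("d", 1.0, "X"),
])
```

Keeping rejected records in their own list, rather than dropping them silently, leaves room to audit or repair them before they affect billing or notifications.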
Take the next step
Creating a successful IoT data pipeline platform requires strategic, long-term vision while focusing on the tactical, immediate requirements. Managing the complexity that arises when designing and implementing an IoT data pipeline platform is a tall order. If you’re responsible for bringing it all together, let’s talk. For over 10 years, Bright Wolf has helped Fortune 1000 enterprises build highly efficient and resilient connected systems that bring a competitive advantage to their products and teams.
In future posts, we’ll dive deeper into the many tools available for building your IoT data pipeline platform and the considerations for choosing which ones best fit your organization and business needs, starting with an iterative approach to getting better results with IoT data science.