The transformation of network softwarization towards 5G requires satisfying requirements across a broad scope of verticals while maintaining the Quality of Service (QoS) and Quality of Experience (QoE) criteria demanded by the various network slice constraints. This session, with a hands-on lab, introduces the three key elements of service assurance – the monitoring, presentation and provisioning layers – and introduces various cloud-native open source frameworks such as collectd, InfluxDB, Grafana, Prometheus, Kafka and the Platform for Network Data Analytics (PNDA).
3. 3
Acknowledgements to
• Tim Verrall
• John Browne
• Damien Power
• Emma Collins
• Jean-Christophe Bouche
• Jim Greene
• Krzysztof Kepka
• Jabir K Kadavathu
• Michal Kobylinski
4. 4
Agenda
• Service Assurance
• Monitoring & Metrics
• OPNFV Barometer
• Integration & Provisioning
• Prometheus
• Kafka
• ONAP & VES
• PNDA
• Fitting Together
5. 5
What is Service Assurance
The application of policies/processes to ensure that services offered over networks meet a pre-defined service quality level for an optimal user or subscriber experience.
SA technologies enable monitoring of FCAPS (Fault, Configuration, Accounting, Performance & Security) attributes on existing network infrastructure.
Figure: Service Assurance mapped to ETSI model
6. 6
Three key elements of a Service Assurance Platform
Monitoring: Enabling deeper management and tracking of specific service levels
– Platform & Network counters to track usage and performance to configured parameters
Presentation: Reporting to enable reaction to service level changes:
– Support for the detection of trending against configured parameters and the enabling of capacity plan
changes based on those trends
Provisioning: Enable configuration of service levels based on workload or service priority:
– Allocate or partition platform resources such as CPU, memory, cache, and network bandwidth
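The provisioning element above is typically realized with mechanisms such as Intel RDT cache allocation; as a rough illustration (not part of the original material), the following Python sketch partitions last-level cache for a workload via the Linux resctrl interface, assuming a kernel with resctrl mounted at /sys/fs/resctrl and RDT/CAT-capable hardware – the group name, cache mask and PID are placeholders:

import os

RESCTRL = "/sys/fs/resctrl"  # assumes the resctrl filesystem is mounted here

def partition_cache(group: str, l3_mask: str, pid: int) -> None:
    """Create a resctrl group, restrict it to a slice of L3 cache, and assign a task."""
    group_dir = os.path.join(RESCTRL, group)
    os.makedirs(group_dir, exist_ok=True)
    # Schemata line for CAT: a bitmask of cache ways per cache domain, e.g. "0=0x0f"
    with open(os.path.join(group_dir, "schemata"), "w") as f:
        f.write(f"L3:{l3_mask}\n")
    # Move the workload's PID into the group so the cache mask applies to it
    with open(os.path.join(group_dir, "tasks"), "w") as f:
        f.write(str(pid))

if __name__ == "__main__":
    partition_cache("priority_vnf", "0=0x0f", pid=1234)  # hypothetical group name and PID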
7. 7
Service Assurance “Phased” Evolution for NFV/SDN
Phase 1 - Equivalence (Virtualized + Interworking with existing management systems)
Phase 2 - Automated by MANO+SDN Controller
Phase 3 - Predict failures and adapt automatically
Phase 1 – Platform Service Assurance – Equivalence
• Platform Service Assurance supporting:
  • Intel RAS Technologies
  • Cache Config & Monitoring
  • BIOS Config & Reporting
  • Fastpath DPDK Interface Reporting
  • Fastpath DPDK Keep Alive
  • Virtual Switch Health
  • Host Health
  • …
Phase 2 – Platform Service Assurance (MANO + SDN Controller)
• VIM and above, support:
  • Enable RAS Technologies
  • Enable Watchdog Metrics
  • Enable DPDK and Keep Alive
  • Enable Host Health
  • Policy Based Provisioning
  • …
Phase 3 – Predictive Platform Service Assurance
• Predict failures and adapt automatically:
  • Automated and adaptive to changes notified in metrics
  • Closed loop and dynamic SA environment
If you can’t measure and control the underlying platform resources, it is hard to measure, monitor and guarantee services running on that infrastructure.
8. Platform Observability & Service Assurance (SA)
• Observability: Ability to expose state of the platform to ensure Service Level
Objectives are met
• Observability Considerations: Logging, Metrics & Tracing
• Communications Service Provider Context:
• Care about overall Service Assurance
• Both Monitoring & Observability are important
• Service Assurance encompasses aspects of Observability
11. 11
Collectd Monitoring Agent
Collectd: Why & What
• Statistics collection daemon
• Uses read plugins to collect metrics and write plugins to send them to an endpoint
• Open source
• Widely adopted
• Configurable Collection Interval
Various Plugin types:
• Input/Output
• Binding Plugins
• Logging Plugins
• Notification Plugins
• Other: Network plugin with both send/receive feature
Figure: Collectd Architecture
https://github.com/collectd/collectd
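To make the read/write plugin model above concrete, here is a minimal sketch of a read plugin written against collectd's Python plugin interface (it assumes the python plugin is loaded in collectd.conf; the plugin name and the metric it reads are purely illustrative):

import collectd  # module provided by collectd's python plugin at runtime

INTERVAL = 10  # seconds – collectd's configurable collection interval

def read_cb(data=None):
    # Read a platform statistic; a trivial illustrative value from /proc/loadavg.
    with open("/proc/loadavg") as f:
        load1 = float(f.read().split()[0])

    vals = collectd.Values(type="gauge")
    vals.plugin = "example_platform"   # illustrative plugin name
    vals.type_instance = "load1"
    vals.dispatch(values=[load1])      # hands the metric to whatever write plugins are loaded

collectd.register_read(read_cb, INTERVAL)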
12. 12
Platform Telemetry Exposure & Integration
Figure: Platform telemetry exposure & integration across the NFVI (compute, network, storage; hypervisor with RT/SA KVM4NFV extensions; virtualised compute/network/storage). Platform sources include PMU^ counters, NIC counters, vSwitch counters, hypervisor/container counters, RAS (Intel® Run Sure Technology: MCA*, PCIe AER, Resilient System/Memory Technology, SDDC DDDC+1 mirroring), RAID/NVMe* (Intel® Rapid Storage Technology), IPMI via the Intel® Management Engine, Intel® Node Manager, Intel® RDT (CMT, CAT, MBM, CDP), power, and out-of-band telemetry – collected by collectd plugins and Intel® Infrastructure Management Technologies. Exposure is through common/standard open APIs: SNMP (Perfmon, Enterprise and NFV Platform MIBs), syslog, IPFIX/sFlow/NetFlow collectors, Kafka, Prometheus, the VES plugin, Redfish, and OpenStack* services (Ceilometer, Gnocchi, Aodh, Vitrage, Congress – partly done/integrated, partly in progress). A fast path triggers on events or counters (e.g. VM stall/RT stall detection, working/protect failover, local corrective action), while a slow path is pulled periodically (every 1/15 minutes) by monitoring/analytics systems, container monitoring solutions (Prometheus, …), vendor SA middleware, open platform collectors and the OpenStack* VIM.
PMU^: Performance Monitoring Unit
13. 13
Platform Telemetry Options – Southbound
Plugin – Description
• Intel RDT Plugin: A read plugin that provides last level cache utilization and memory bandwidth utilization.
• Huge Pages Plugin: Allows monitoring of free and used hugepage numbers/bytes/percentages on the platform.
• vSwitch Stats Plugin: A read plugin that retrieves interface/link stats from OVS.
• vSwitch Events Plugin: A read plugin that retrieves events (like link status changes) & liveliness from OVS.
• IPMI Plugin: A read plugin that reports platform thermals, voltages, fan speed, current, flow, power, etc. The plugin also monitors the Intelligent Platform Management Interface (IPMI) System Event Log (SEL) and sends appropriate notifications based on monitored SEL events.
• Virt Plugin (Libvirt): A read plugin that uses the libvirt virtualization API to gather statistics about virtualized guests on a system directly from the hypervisor, without the need to install a collectd instance on the guest.
• DPDK Stats Plugin: A read plugin that retrieves stats from the DPDK extended NIC stats API.
• DPDK Events Plugin: A read plugin that retrieves DPDK link status and DPDK forwarding cores liveliness status (DPDK Keep Alive).
• RAS Memory Plugin: A read plugin that uses mcelog to check for memory Machine Check Exceptions and sends the stats for reported exceptions.
• PCIe AER Plugin: A read plugin that monitors PCIe standard and advanced errors and sends notifications about those errors.
Note: Not an exhaustive list
14. 14
Platform Telemetry Options – Southbound
Plugin – Description
• DPDK Stats Plugin: A read plugin that retrieves stats from the DPDK extended NIC stats API.
• PMU Plugin: A read plugin that collects performance monitoring events supported by Intel Performance Monitoring Units (PMUs). The PMU is hardware built into a processor to measure its performance parameters, such as instruction cycles, cache hits, cache misses, branch misses and many others.
• Log Parser Plugin: A read plugin that uses mcelog to check for CPU, IO, QPI or system Machine Check Exceptions and sends the stats for reported exceptions.
• Redfish Plugin: A read plugin that collects metrics available via Redfish endpoints, e.g. in the RSD architecture.
• Storage (RAID) Plugin: A read plugin responsible for gathering events from RAID arrays that were written to syslog by the mdadm utility.
• SMART Plugin: A read plugin that gathers Self-Monitoring, Analysis and Reporting Technology (SMART) data from block devices, primarily adding support for NVMe devices.
• DataCenter Persistent Memory Plugin: Provides metrics from Intel DataCenter persistent memory.
• Power Plugin Enhancements: Added metrics for the power and frequency plugins:
  • CPU Freq Plugin: number of p-state (CPU frequency) transitions & time spent in each p-state
  • Turbostat Plugin: p-states enabled/disabled, Turbo Boost enabled/disabled, platform Thermal Design Point, uncore bus ratio
Note: Not an exhaustive list
15. 15
Platform Telemetry Options – Northbound
Plugin – Description
• Gnocchi Plugin: A write plugin that pushes the retrieved stats to Gnocchi. It is capable of pushing any stats read through collectd to Gnocchi, not just the DPDK stats.
• Write Kafka Plugin: A write plugin that publishes the metrics to Kafka.
• Write Prometheus Plugin: Exposes data to Prometheus directly, rather than relying on the collectd-exporter.
• Aodh Plugin: A write notification plugin that pushes events to Aodh and creates/updates alarms appropriately.
• SNMP Agent Plugin: A write plugin that acts as an AgentX subagent, receiving and handling queries from the SNMP master agent and returning the data collected by read plugins. The SNMP Agent plugin handles requests only for OIDs specified in the configuration file. Supports SNMP get, getnext and walk requests. An SNMP write plugin is not supported by the platform team.
• AMQP1 Plugin: A plugin that sends metrics and events over the AMQP 1.0 bus.
• Network Plugin: Sends metrics to connected nodes.
• Write Graphite Plugin: A widely used plugin to store metrics in a Graphite database.
Note: Not an exhaustive list
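As a counterpart to the write plugins listed above, a northbound path can also be sketched with collectd's Python plugin interface: the write callback below (illustrative, not one of the plugins in the table) serializes each dispatched value to a JSON record, which is roughly the shape the Kafka/AMQP1 plugins publish:

import json
import collectd  # module provided by collectd's python plugin at runtime

def write_cb(vl, data=None):
    # vl is the collectd Values object handed to every registered write callback
    record = {
        "host": vl.host,
        "plugin": vl.plugin,
        "plugin_instance": vl.plugin_instance,
        "type": vl.type,
        "type_instance": vl.type_instance,
        "time": vl.time,
        "values": list(vl.values),
    }
    # A real northbound plugin would publish this record to Kafka, AMQP, etc.
    collectd.info(json.dumps(record))

collectd.register_write(write_cb)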
16. 16
OPNFV Barometer
Barometer Strategy:
• Ensure platform metrics/events are accessible through open industry-standard interfaces.
• Demonstrate that platform & network technologies can be monitored, consumed and actioned in real time.
One Click Install:
• Easy install/configuration for customers
• One command to install collectd/InfluxDB/Grafana
• Three-container approach for collectd:
  • Stable container: latest stable branch
  • Master container: up to date with master
  • Experimental container: cherry-pick features of interest
17. Collectd & Barometer Microservice
• Easier to deploy
• Standard environment
• Scalability
Reference container images are hosted at https://hub.docker.com/r/opnfv/barometer-collectd/
18. 18
Collectd & Barometer Microservice
Containerisation with Ansible support:
• Installs collectd, InfluxDB, Grafana, Kafka & VES containers
• Easier installation, configuration, collection and visualization of the NFVI metrics
• Supports both HA and non-HA deployments
• Speeds up deployment of collectd by providing golden images
OpenStack Kolla also builds containers based on collectd, configurable through Ansible
Automation:
• OPNFV CI ensures successful Barometer deployment with OPNFV installers
• Supports Apex & will be adding Compass support
Fastest way to Introduce Platform Telemetry to ‘Your’ Infrastructure
19. 19
Early Adoption of IA Features – Upstream & Downstream
• Showcase IA features’ telemetry via OPNFV Barometer upstream
• Three-container approach for collectd:
  • Stable container: latest stable branch (latest stable release)
  • Master container: up to date with master (latest accepted by the community)
  • Experimental container: cherry-pick features of interest (latest & greatest of IA metrics)
• Downstream IA-specific plugins via Red Hat OpenStack Platform
22. 22
NSB providing AI/ML data sets
NSB framework used to run test cases over varying intervals on a commercial EPC or similar use cases.
Barometer used to set up InfluxDB and collectd containers.
Figure: NSB test setup – test case(s) drive a traffic generator against a commercial or sample VNF over a chosen context (bare metal, standalone or OpenStack) spanning compute, storage and network, while NFVI, application and network metrics are gathered; results are published as HTML reports and dashboards.
Collectd pushes the platform metrics to InfluxDB while the test cases are being executed.
The metrics from the VNF, traffic generator, and platform are all converted to CSV and sent to the data scientists.
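As a rough sketch of that last step (the addresses, database and measurement names are assumptions that depend on the Barometer/collectd configuration in use), the metrics stored in InfluxDB can be pulled out as CSV through InfluxDB's 1.x HTTP query API:

import requests

INFLUXDB = "http://localhost:8086"  # assumed InfluxDB address from the Barometer containers
DB = "collectd"                     # assumed database name written to by collectd

# Illustrative measurement name; actual names depend on the enabled collectd plugins
query = "SELECT * FROM cpu_value WHERE time > now() - 1h"

resp = requests.get(
    f"{INFLUXDB}/query",
    params={"db": DB, "q": query},
    headers={"Accept": "application/csv"},  # ask InfluxDB 1.x to return results as CSV
    timeout=10,
)
resp.raise_for_status()

with open("platform_metrics.csv", "w") as f:
    f.write(resp.text)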
23. 23
Prometheus
• Open-source systems monitoring and alerting toolkit
• Pull model; integrated with collectd via:
  • the collectd native plugin, or
  • the Prometheus collectd exporter
• The Red Hat Service Assurance Framework uses AMQP1 to push metrics to Prometheus
(Barometer container: collectd or exporter)
Img src: https://prometheus.io/docs/introduction/overview/
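Because Prometheus pulls the metrics, a quick way to check that collectd data is arriving is to query the Prometheus HTTP API; a minimal sketch follows (the server address and metric name are assumptions – actual series names depend on how write_prometheus or the collectd exporter labels them):

import requests

PROMETHEUS = "http://localhost:9090"  # assumed Prometheus server address
QUERY = "collectd_cpu_percent"        # illustrative metric name

resp = requests.get(f"{PROMETHEUS}/api/v1/query", params={"query": QUERY}, timeout=5)
resp.raise_for_status()

for result in resp.json()["data"]["result"]:
    # Each result carries a label set and the latest [timestamp, value] sample
    print(result["metric"], result["value"])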
24. 24
Red Hat Telemetry Framework
The telemetry framework is a dynamic application running atop OpenShift (Kubernetes), using several components such as Prometheus, the Smart Gateway, collectd and the Apache Qpid Dispatch Router.
GitHub: https://github.com/redhat-service-assurance
Source: https://telemetry-framework.readthedocs.io/en/master/overview.html
26. 26
ONAP
• Addresses the need for a common global-scale orchestration & automation platform for telco, cable & cloud operators
• Framework that allows specification of a service in all aspects – policy, control, behaviour, analytics, closed loop, etc.
Img src: https://www.onap.org/wp-content/uploads/sites/20/2018/06/ONAP_CaseSolution_Architecture_0618FNL.pdf
Figure: ONAP Architecture
27. 27
VNF Event Stream (VES)
• VES provides a converged event stream format to simplify closed-loop automation
• Reduces the effort to integrate VNF telemetry
• Integrates platform & VNF telemetry into automated VNF management systems, like DCAE
• Convergence to a common event stream format and collection system
• Feeds the VES collector in DCAE with unified data
Img src: https://wiki.opnfv.org/display/fastpath/VES+plugin+updates
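To illustrate the “common event stream format” idea, here is a hedged sketch of posting a VES-style event to a collector; the collector address, the API version in the path and the field subset follow the general shape of the VES common event header, but must be checked against the VES specification and collector actually deployed:

import time
import requests

# Assumed collector address and API version – adjust to the deployed VES collector
VES_COLLECTOR = "http://ves-collector.example:30235/eventListener/v5"

now_us = int(time.time() * 1e6)
event = {
    "event": {
        "commonEventHeader": {
            # Illustrative subset of the common event header fields
            "domain": "measurementsForVfScaling",
            "eventName": "Measurement_collectd_example",
            "eventId": "example-0001",
            "sourceName": "compute-node-1",
            "reportingEntityName": "collectd-ves-app",
            "priority": "Normal",
            "startEpochMicrosec": now_us,
            "lastEpochMicrosec": now_us,
            "sequence": 0,
            "version": 3,
        }
    }
}

resp = requests.post(VES_COLLECTOR, json=event, timeout=5)
print(resp.status_code, resp.text)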
28. 28
Kafka
It is a:
• Messaging system
• Pub-sub model
• Fault tolerant
Why Kafka:
• Build real-time streaming data pipelines
• Build real-time applications that react to streaming data
Terminology:
• Topic
• Replicas – copies of partitions
• Brokers – maintain the published data (the Kafka servers)
• Zookeeper – manages Kafka brokers & notifies producers/consumers
• Cluster – more than one broker; manages persistence & replication of message data
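A hedged Python sketch of the pub-sub model, using the kafka-python package; the broker address and topic name are assumptions (a Barometer-style setup typically publishes collectd metrics on a topic configured in the write_kafka plugin):

from kafka import KafkaConsumer, KafkaProducer  # pip install kafka-python

BROKER = "localhost:9092"  # assumed broker address
TOPIC = "collectd"         # assumed topic name from the write_kafka configuration

# Producer: publish a record to the topic
producer = KafkaProducer(bootstrap_servers=BROKER)
producer.send(TOPIC, b'{"host": "compute-node-1", "plugin": "cpu", "values": [12.5]}')
producer.flush()

# Consumer: subscribe to the topic and react to the streaming data
consumer = KafkaConsumer(TOPIC, bootstrap_servers=BROKER, auto_offset_reset="earliest")
for message in consumer:
    print(message.topic, message.partition, message.offset, message.value)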
30. Platform for Network Data Analytics (PNDA.io) Overview
Simple, scalable open data platform
Provides a common set of services for developing analytics applications
Accelerates the process of developing big data analytics applications whilst significantly reducing the TCO
PNDA provides a platform for convergence of network data analytics
Figure: PNDA architecture – plugins (ODL, Logstash, OpenBPM, pmacct, telemetry) feed data through the PNDA producer API into real-time data distribution and a file store. Platform services cover installation, management, security and data privacy, plus app packaging and management. Processing and query layers include stream and batch processing, SQL query, OLAP cube, search/Lucene, NoSQL time series, data exploration and metric/event visualisation. PNDA-managed and unmanaged applications consume data via the PNDA consumer API, with query, visualisation and exploration on top.
32. 42
Apache Avro
Language neutral data serialization system
Provides rich data structures
Stores the data definition in JSON format, making it easy to read and interpret
The data itself is stored in a binary format, making it compact and efficient
Supports schemas for defining data structure
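A small sketch of the JSON-defined schema plus compact binary encoding in Python, using the fastavro package; the schema mirrors the PNDA event schema shown in the presenter's notes, and the field values are illustrative:

import io
import json
import time
from fastavro import parse_schema, schemaless_writer, schemaless_reader  # pip install fastavro

# The data definition is plain JSON – easy to read and interpret
schema = parse_schema({
    "namespace": "pnda.entity",
    "type": "record",
    "name": "event",
    "fields": [
        {"name": "timestamp", "type": "long"},
        {"name": "src", "type": "string"},
        {"name": "host_ip", "type": "string"},
        {"name": "rawdata", "type": "bytes"},
    ],
})

record = {
    "timestamp": int(time.time() * 1000),
    "src": "collectd",
    "host_ip": "10.0.0.1",
    "rawdata": json.dumps({"plugin": "cpu", "value": 12.5}).encode(),
}

# The data itself is written in a compact binary format
buf = io.BytesIO()
schemaless_writer(buf, schema, record)
print(f"{buf.tell()} bytes on the wire")

buf.seek(0)
print(schemaless_reader(buf, schema))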
34. 34
• Can integrate east/west with MANO systems
• collectd data ingestion goes through Kafka topics
Img src: https://wiki.onap.org/display/DW/ONAP+Beijing+Release+Developer+Forum%2C+Dec.+11-13%2C+2017%2C+Santa+Clara%2C+CA+US?preview=/16002054/20874945/Telemetry-Analytics-ONAP-11Dec2017.pdf
38. 38
Closed Loops – Networking Stack
Figure: Closed loops mapped onto the networking stack. The stack spans hardware/disaggregated hardware, the data path, operating systems, network control, cloud & virtual management, orchestration/management/policy, network data analytics and the application layer, grouped into infrastructure, management & control, and services. Closed-loop reaction times range from micro/milliseconds for HW-enabled loops (e.g. RAS) and data path loops (HA etc.) with domain knowledge local to the platform, through seconds/minutes for enforcing local and network-domain policy, up to minutes/hours/days for end-to-end analysis, planning and mapping of deployment policies.
High Speed Control Loops are Close to the Platform
39. Closed Loops – Business Cases
Figure: Business use cases driving closed loops – improved customer experience, cloud optimization & efficiency, edge placement, service healing, differentiated QoS, service optimization, energy optimization, capacity optimization, cloud configurations and security (threat detection, threat response). AI/ML/DL platform(s), analytics and business applications sit above an NFV Orchestrator (NFVO) [e.g. ONAP/OSM] and VNF Manager (VNFM), driving policy-based provisioning and control loops. The platform layer (OpenStack*, Kubernetes*, bare metal) exposes feature, provisioning and telemetry interfaces – collectd, Intel® Infrastructure Management Technologies, Intel® RDT, power, monitoring/storage and Intel® Run Sure Technology – with local policy enforcement agent(s) for local dynamic control.
42. Barometer Links
Barometer Home: https://wiki.opnfv.org/display/fastpath/Barometer+Home
Collectd advantages, etc.: https://wiki.opnfv.org/display/fastpath/Collectd+advantages%2C+disadvantages+and+a+few+asides
Collectd integration with Prometheus: https://wiki.opnfv.org/display/fastpath/Collectd+integration+with+prometheus
Metrics/Events through Barometer (not on Collectd site): https://wiki.opnfv.org/display/fastpath/Collectd+Metrics+and+Events#CollectdMetricsandEvents-Metrics
43. 44
Redfish
• Industry-standard software-defined management for converged, hybrid IT
• REST API / HTTPS / JSON
• Provides, among other things, the ability to collect OOB telemetry
• v1.0 – power, temperature, fan speed
• Last release: 2018.2
• Eventing (Metric Reports)
Src: https://www.dmtf.org/sites/default/files/2017_12_Redfish_Introduction_and_Overview.pdf
44. 41
Redfish collectd plugin
• Read plugin for OOB telemetry
• Configurable via collectd.conf:
  • Queries – list of Redfish path definitions to metric collections
  • Services – list of endpoints to which requests are sent for a chosen set of queries
• Plugin future direction (WIP):
  • Extended telemetry
  • Eventing mechanism (TelemetryService)
  • More dynamic config, autodiscovery, wildcards
Figure: Plugin flow – the configuration/context (queries path definitions; services with endpoint & queries) drives libredfish queues, which issue requests to the Redfish API [PODM, PSME, ...]; the JSON responses are parsed and dispatched as collectd values.
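For context on what the plugin's “queries” boil down to, here is a hedged Python sketch that walks a Redfish service for chassis thermal readings over its standard REST/JSON interface; the endpoint address and credentials are placeholders, and the resource paths follow the common Redfish schema layout, which can vary by implementation:

import requests

BASE = "https://bmc.example.local"  # placeholder Redfish endpoint (e.g. a BMC, PODM or PSME)
AUTH = ("admin", "password")        # placeholder credentials

def get(path):
    resp = requests.get(BASE + path, auth=AUTH, verify=False, timeout=10)
    resp.raise_for_status()
    return resp.json()

# Enumerate chassis, then read each chassis' Thermal resource (temperatures, fans)
for member in get("/redfish/v1/Chassis")["Members"]:
    chassis = get(member["@odata.id"])
    thermal_ref = chassis.get("Thermal", {}).get("@odata.id")
    if not thermal_ref:
        continue
    thermal = get(thermal_ref)
    for temp in thermal.get("Temperatures", []):
        print(chassis["Id"], temp.get("Name"), temp.get("ReadingCelsius"))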
Presenter's Notes
Sunku
At a very high level, Service Assurance includes 3 key elements – Monitoring, Presentation and Provisioning.
Monitoring includes getting the insight into various platform & network counters to correlate against established KPIs
The monitoring interfaces need to be able to integrate with both legacy and next gen management/controller systems while providing open standard interfaces
Presentation is the ability to report the metrics so that relevant action can be taken on service level changes.
Traditionally this is done by human intervention to address failures/violations, but with the amount of data available for processing the goal is to move towards dynamic resolution based on the configured parameters.
Provisioning involves the idea of reconfiguring platform resources to meet the service level objectives in a service level agreement.
Today’s presentation will focus on the first two elements.
Phase 1: Make sure that the relevant telemetry is instrumented where appropriate and that telemetry is exposed through collectd to existing management systems.
Basically the goal of phase 1 is to interwork with existing management systems.
Phase 2: Moving to the NFV deployment world, we want to make sure that the virtual infrastructure as well as the management and orchestration layer can understand the telemetry and events that it is receiving, so that it can take appropriate action based on that telemetry.
Phase 3: is where we want to integrate with ML to allow us to make more intelligent placement decisions and adjustments to scheduling policy.
More importantly, it correlates NFVI failures with VNF performance issues, to be able to predict failures and automatically adapt our environment as required based on the current state of the platform and the telemetry of the different subsystems.
Phase 1 will continue indefinitely, as the ingredient teams will always be upstreaming new features to collectd.
Phase 2 and Phase 3 is what we are currently implementing
John
When we started out we would deploy collectd directly on the platform.
This caused issues as every system is different.
If we were dealing with customers we would have no control over what they had installed on their system.
Within the barometer project we decided to have a SA package.
This included collectd, InfluxDB and Grafana containers.
This allows the customer to deploy with ease.
We then added ansible scripts to make the deployment process a one line command.
In short a new customer should be able to take our package, deploy it, and see platform metrics on a web browser within 10 minutes.
John
When we started out we would deploy collectd directly on the platform.
This caused issues as every system is different.
If we were dealing with customers we would have no control over what they had installed on their system.
At the same time the opnfv community was shifting towards containers.
For this reason we decided to containerise collectd.
This gave us multiple benefits.
We could ensure that collectd would be installed in the same environment.
It’s much quicker to pull the collectd container instead of building and installing it on your system.
None of the software within the container would interfere with what is already installed on the system.
When we containerised collectd we also containerised other necessary components.
Once we get the metrics from collectd we need to save them, for which we used InfluxDB.
In an effort to provide end to end
When we containerised collectd we also decided to containerise other important components.
These included:
Influx which we use as our time series database and
Grafana which we use to graph the metrics within influx
After this when we started talking to customers we found a few issues.
Customers were unfamiliar with containers and needed help setting them up.
Deploying at scale was still slow.
For this reason we created ansible scripts to deploy and configure the containers.
Providing Ansible support has made it easy for us to introduce SA to various customers.
The idea is to provide a one-click install that installs & configures the necessary containers: collectd, InfluxDB, Grafana, Kafka as a message bus, and the VNF Event Stream (VES) containers.
We made it easy to install, configure and visualize various NFVi metrics at scale.
Various customers are currently using them and playing with it.
On the plugins, we have tight collaboration with NFVi BKC that tests the plugins across combinations of various operating systems, platform generations, NICs, etc. This ensures our plugins are up to date with IA platforms.
On the automation front, we have good integration with OPNFV CI Functest to ensure Barometer deploys with OPNFV installers without regression. Barometer currently supports Apex and we will be adding Compass support.
The downside of the collectd community is that there is no established cadence of releases or merging of pull requests.
In order to showcase IA features early through Barometer we came up with a three-container approach:
The Stable container provides the latest stable branch of collectd, with fully tested and validated plugins.
The Master container provides plugins & bug fixes that are on master and not yet in a release.
The Experimental container provides the latest and greatest plugins by cherry-picking the newest pull requests instead of waiting for them to make it onto master.
This way we provide early access to the telemetry feature set.
Downstream, we have a strong engineer-to-engineer partnership with Red Hat on OSP to ensure IA-specific plugins make it into OSP releases.
Sunku
The basic goal of ML is to achieve closed-loop automation using IA metrics.
NSB is used to generate the data for the ML analysis
We have had a few ML teams engaged and they have been able to correlate multiple IA metrics to packet loss on an EPC
We are still in the early stages of ML.
KK
Prometheus is a project originally from SoundCloud; it has since been moved to the Cloud Native Computing Foundation (CNCF) as the second hosted project, right after Kubernetes.
It is an open-source monitoring system which has a few interesting features, like
a dimensional data model
a flexible query language
a built-in efficient time series database
and a modern alerting approach
The integration with collectd is already in place.
We have verified how it can be integrated in two ways:
One is the collectd native plugin, which serves an HTTP endpoint from which the Prometheus server can scrape the latest metrics.
And the second one is the Prometheus collectd exporter, which acts like a proxy: collectd writes data with its network plugin and the Prometheus server can get the data from there.
What is worth highlighting here is that Prometheus works in a pull model, rather than the standard collectd push of metrics, so it is a slightly different architecture.
Prometheus is part of many solutions, like
NGCO, on which Sunku will tell you more in a moment
the RH OSP SA framework
or also a proposal in ONAP to integrate with OOF in the edge area.
And there is also a container coming that will be part of the Barometer collection.
The bottom line is that all the platform stats can be pulled into Prometheus today without any additional development.
KK
Open Network Automation Platform is a project under Linux Foundation Networking governance.
By unifying member resources, ONAP is accelerating the development of a vibrant ecosystem around a globally shared architecture and implementation for network automation, with focus on open standards.
It „provides a comprehensive platform for real-time, policy-driven orchestration and automation of physical and virtual network functions that will enable software, network, IT, cloud providers and developers to rapidly automate new services and support complete lifecycle management.”
Our goal is to enable closed loop automation, currently by engaging on VES and EPA/HPA projects.
KK
So VES, or VNF Event Stream, is a project whose goal is to enable a significant reduction in the effort required to develop and integrate VNF telemetry-related data into automated VNF management systems, by promoting convergence to a common event stream format and collection system.
In the current implementation collectd sends metrics to the Kafka bus; from there the VES application picks them up, unifies them against a given schema and sends them to the VES collector (which is part of DCAE).
It is the chosen solution in the core area.
KK
KK
KK
What is PNDA? „The scalable, open source big data analytics platform for networks and services”
PNDA, similar to OPNFV or ONAP, is a project under Linux Foundation Networking.
In terms of functionality
PNDA aggregates data like logs, metrics and network telemetry
Scales up to consume millions of messages per second
Efficiently distributes data with publish and subscribe model
Processes bulk data in batches, or streaming data in real-time
Manages lifecycle of applications that process and analyse data
Lets developers gain insight and explore data using interactive notebooks
The entry point for collectd is the Kafka bus; there are a few ways to ingest data for consumption by analytics apps:
In raw JSON format directly with the write_kafka plugin
Or with the network plugin through Logstash, with raw JSON or Avro formatting
We have verified that it works properly with Red PNDA, and we are also working with Cisco as a partner on better customer-oriented use cases for analytics.
PNDA can integrate with MANO systems, supporting closed loop automation at the east/west level.
And there is a proposal to integrate PNDA into DCAE (part of ONAP) by replacing CDAP with similar functionality.
KK
AR: change color scheme / bigger font
Direct ingestion with write_kafka could have better performance
But the preferred format in PNDA is Avro, which is a data serialization framework with a given schema in JSON.
(If I remember correctly) the check was done with a generic Avro serializer, not the PNDA one, so additional field mutation may be required, which should not be the case with the pnda-avro codec.
They were the same at the time the check was made, apart from a single extra step of base64 encoding, which was the issue when decoding messages with the given example consumer in Red PNDA.
PNDA AVRO schema:
{
  "namespace": "pnda.entity",
  "type": "record",
  "name": "event",
  "fields": [
    {"name": "timestamp", "type": "long"},   // time when the event was generated/ingested by PNDA
    {"name": "src",       "type": "string"}, // e.g. collectd
    {"name": "host_ip",   "type": "string"}, // host where the data was generated
    {"name": "rawdata",   "type": "bytes"}   // rest of the raw data
  ]
}
We are looking to help with these using IA features – feature exposure, provisioning & telemetry – and to enable/fill in these gaps.