E-commerce giant eBay has released Pulsar, an open-source real-time analytics platform and stream processing framework. Pulsar collects and processes user and business events in real time, scaling to a million events per second with high availability.
As buyer and seller traffic on eBay keeps growing, new use cases keep emerging that require data to be collected and processed in real time. Until now, user experience optimization and behaviour analysis were carried out on batch-oriented data platforms like Hadoop.
To enable better interaction and to “derive actionable insights and generate signals for immediate action”, eBay decided to develop its own distributed complex event processing (CEP) framework. Pulsar CEP provides a Java-based framework as well as tooling to build, deploy, and manage CEP applications in a cloud environment, explains the blog post making the announcement.
The post points out that Pulsar CEP includes the following capabilities:
- Declarative definition of processing logic in SQL
- Hot deployment of SQL without restarting applications
- Annotation plugin framework to extend SQL functionality
- Pipeline flow routing using SQL
- Dynamic creation of stream affinity using SQL
- Declarative pipeline stitching using Spring IOC, thereby enabling dynamic topology changes at runtime
- Clustering with elastic scaling
- Cloud deployment
- Publish-subscribe messaging with both push and pull models
- Additional CEP capabilities through Esper integration
Built into Pulsar is a real-time analytics data pipeline that provides higher reliability and scalability through processes such as data enrichment, filtering and mutation, aggregation, and stateful processing.
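To make those pipeline stages concrete, here is a minimal Python sketch of enrichment, filtering, mutation, and stateful aggregation applied to a small batch of events. It is purely illustrative and does not use Pulsar's API (Pulsar itself is Java-based and expresses this logic in SQL); the event fields, the `REGION_BY_IP_PREFIX` lookup table, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical lookup table standing in for an enrichment source such as geo data.
REGION_BY_IP_PREFIX = {"10.": "internal", "66.": "US", "81.": "EU"}


def enrich(event):
    """Enrichment: attach a region derived from the event's IP prefix."""
    region = next(
        (r for prefix, r in REGION_BY_IP_PREFIX.items() if event["ip"].startswith(prefix)),
        "unknown",
    )
    return {**event, "region": region}


def keep(event):
    """Filtering: drop traffic that originates from the internal network."""
    return event["region"] != "internal"


def mutate(event):
    """Mutation: remove the raw IP and normalize the event type to lowercase."""
    out = {k: v for k, v in event.items() if k != "ip"}
    out["type"] = out["type"].lower()
    return out


def run_pipeline(events):
    """Run every event through the stages and count events per (region, type)."""
    counts = Counter()  # state that persists across events (stateful processing)
    for event in events:
        enriched = enrich(event)
        if not keep(enriched):
            continue
        cleaned = mutate(enriched)
        counts[(cleaned["region"], cleaned["type"])] += 1  # aggregation
    return counts


if __name__ == "__main__":
    sample = [
        {"ip": "66.1.2.3", "type": "VIEW"},
        {"ip": "10.0.0.1", "type": "view"},
        {"ip": "81.2.3.4", "type": "Bid"},
        {"ip": "66.9.9.9", "type": "view"},
    ]
    print(run_pipeline(sample))
```

In a real stream processor the loop would run continuously over an unbounded stream and the counters would typically be scoped to time windows; the sketch processes a finite list only to keep the stages easy to follow.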
The now open-sourced Pulsar has been deployed in production at eBay and is processing all user behavior events, the post says. A dashboard is under development that will make integration with metrics stores like Cassandra and Druid easier.
(Image credit: Justin Sullivan, Getty Images)