The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Apache Spark - A unified analytics engine for large-scale data processing
ClickHouse® is a real-time analytics DBMS
💫 Industrial-strength Natural Language Processing (NLP) in Python
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe,...
Apache Flink
An open source cybersecurity protocol for syncing decentralized graph data.
The official home of the Presto distributed SQL query engine for big data
A Beginner's Guide to Big Data :star:
An open source time-series database for fast ingest and SQL queries
The Data Engineering Cookbook
PredictionIO, a machine learning server for developers and ML engineers.
Apache Doris is an easy-to-use, high performance and unified analytics d...
CMAK is a tool for managing Apache Kafka clusters
A distributed, fast open-source graph database featuring horizontal sc...