Installation
Requirements
- Python 3.11+
- Apache Kafka
- MongoDB
- Docker (optional, for development)
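The local prerequisites in the list above can be checked before installing. The following is a minimal sketch and not part of FlowKit: the function name `check_requirements` is made up for illustration, and it only covers the Python version and whether a `docker` executable is on the `PATH`. Kafka and MongoDB are services, so they are not covered here.

```python
import shutil
import sys

# Minimum interpreter version taken from the requirements list.
MIN_PYTHON = (3, 11)


def check_requirements(version=None, which=shutil.which):
    """Return a dict describing which local prerequisites are met.

    `version` and `which` are injectable so the check can be exercised
    without depending on the current machine.
    """
    version = tuple(version or sys.version_info[:2])
    return {
        "python_ok": version >= MIN_PYTHON,
        # Docker is optional; this only reports whether the CLI is on PATH.
        "docker_found": which("docker") is not None,
    }


if __name__ == "__main__":
    print(check_requirements())
```

A missing Docker CLI is not an error by itself, since Docker is only needed for the development setup.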
Install FlowKit
From PyPI
From Source
Development Installation
For development with all dependencies:
Infrastructure Setup
Apache Kafka
FlowKit requires Kafka for message coordination. You can run Kafka locally using Docker:
# Start ZooKeeper and Kafka
docker run -d --name zookeeper -p 2181:2181 zookeeper:3.8
# Pin a 7.x image: cp-kafka 8.0 and later no longer support ZooKeeper mode
docker run -d --name kafka -p 9092:9092 \
  --link zookeeper:zookeeper \
  -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
  -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \
  -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
  confluentinc/cp-kafka:7.6.1
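Once the containers are up, it can help to confirm that something is listening on the advertised port before starting a coordinator or worker. The sketch below uses only the Python standard library; `port_is_open` is a hypothetical helper, not a FlowKit function, and an open port shows only that a listener exists, not that the broker is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # 9092 matches the -p 9092:9092 mapping used in the docker command.
    print("Kafka reachable:", port_is_open("localhost", 9092))
```

The same check works for MongoDB once it is running (its default port is 27017).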
Or use the provided Docker Compose file:
MongoDB
FlowKit uses MongoDB for task state and artifact persistence:
Or with Docker Compose:
Configuration
Create configuration files for your coordinator and workers:
Coordinator Configuration (coordinator.json)
{
  "kafka_bootstrap": "localhost:9092",
  "worker_types": ["indexer", "processor", "analyzer"],
  "heartbeat_soft_sec": 300,
  "heartbeat_hard_sec": 3600,
  "lease_ttl_sec": 60
}
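A quick sanity check can catch typos in the coordinator file before it is used. The sketch below is not FlowKit's own validation. It assumes, based only on the field names, that the timing values should be positive integers and that the soft heartbeat limit should be smaller than the hard one; `check_coordinator_config` is a name chosen for illustration.

```python
import json


def check_coordinator_config(cfg: dict) -> list[str]:
    """Return a list of human-readable problems; empty means no issue found."""
    problems = []
    timing_keys = ("heartbeat_soft_sec", "heartbeat_hard_sec", "lease_ttl_sec")
    for key in timing_keys:
        value = cfg.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append(f"{key} should be a positive integer, got {value!r}")
    # Assumption: the soft limit is meant to trigger before the hard limit.
    if not problems and cfg["heartbeat_soft_sec"] >= cfg["heartbeat_hard_sec"]:
        problems.append("heartbeat_soft_sec should be smaller than heartbeat_hard_sec")
    if not cfg.get("worker_types"):
        problems.append("worker_types should list at least one type")
    if ":" not in str(cfg.get("kafka_bootstrap", "")):
        problems.append("kafka_bootstrap should look like host:port")
    return problems


# The example configuration from this page.
EXAMPLE = json.loads("""
{
  "kafka_bootstrap": "localhost:9092",
  "worker_types": ["indexer", "processor", "analyzer"],
  "heartbeat_soft_sec": 300,
  "heartbeat_hard_sec": 3600,
  "lease_ttl_sec": 60
}
""")

if __name__ == "__main__":
    print(check_coordinator_config(EXAMPLE))
```

To check a file on disk, load it with `json.load` and pass the resulting dict to the function.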
Worker Configuration (worker.json)
{
  "kafka_bootstrap": "localhost:9092",
  "roles": ["processor"],
  "worker_id": null,
  "lease_ttl_sec": 60,
  "hb_interval_sec": 20
}
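The worker and coordinator files share several settings, so a cross-check can be useful. The sketch below rests on assumptions that this page does not confirm: that every worker role should appear in the coordinator's `worker_types`, that both sides should point at the same Kafka bootstrap address, and that the heartbeat interval should be shorter than the lease TTL so a lease can be renewed before it expires. The function name is hypothetical.

```python
def check_worker_against_coordinator(worker: dict, coordinator: dict) -> list[str]:
    """Return a list of mismatches between a worker and a coordinator config."""
    problems = []
    # Assumption: roles are drawn from the coordinator's worker_types.
    unknown = sorted(set(worker["roles"]) - set(coordinator["worker_types"]))
    if unknown:
        problems.append(f"roles not known to the coordinator: {unknown}")
    if worker["kafka_bootstrap"] != coordinator["kafka_bootstrap"]:
        problems.append("worker and coordinator use different kafka_bootstrap values")
    # Assumption: heartbeats must arrive more often than the lease expires.
    if worker["hb_interval_sec"] >= worker["lease_ttl_sec"]:
        problems.append("hb_interval_sec should be smaller than lease_ttl_sec")
    return problems


# Values copied from the two example files on this page.
COORDINATOR = {
    "kafka_bootstrap": "localhost:9092",
    "worker_types": ["indexer", "processor", "analyzer"],
    "lease_ttl_sec": 60,
}
WORKER = {
    "kafka_bootstrap": "localhost:9092",
    "roles": ["processor"],
    "worker_id": None,
    "lease_ttl_sec": 60,
    "hb_interval_sec": 20,
}

if __name__ == "__main__":
    print(check_worker_against_coordinator(WORKER, COORDINATOR))
```

With the example values, three heartbeats fit inside one 60-second lease, which leaves room for a missed heartbeat under the assumption above.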
Verification
Test your installation:
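One generic way to confirm that a package landed in the active environment is to ask Python whether it can locate the module. This is a minimal sketch using the standard library only; the import name `flowkit` is an assumption that this page does not confirm, so substitute the real module name if it differs.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "flowkit" is an assumed import name, not confirmed by this page.
    print("FlowKit importable:", is_installed("flowkit"))
```

This only shows that the module can be found; it does not exercise Kafka or MongoDB connectivity.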