# Fleets

Fleets in Zerve provide an easy-to-use parallel processing feature. With the `spread()` function, you can launch parallel tasks from a single Python call: `spread()` takes a list as input and fans out one compute block per element of that list. After execution, the `gather()` function collects all results into a list for subsequent processing.
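Zerve's `spread()`/`gather()` functions operate on compute blocks inside the canvas, but conceptually they follow a standard fan-out/fan-in pattern. As a rough plain-Python analogy (this is a stand-in using `concurrent.futures`, not Zerve code; all names are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def process(item):
    # Stand-in for the work each fanned-out compute block would do
    return item * 10

items = [1, 2, 3, 4]

# "spread": one parallel task per list element;
# "gather": collect all results back into an ordered list
with ThreadPoolExecutor() as pool:
    results = list(pool.map(process, items))

print(results)  # [10, 20, 30, 40]
```

In a fleet, `process` would be the body of the spread compute block, and the downstream aggregator block would receive `results` via `gather()`.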

### ML Pipeline in Fleets:

Parallel processing in Zerve's fleets can significantly enhance the efficiency of building machine learning (ML) pipelines. Once data processing is complete, multiple algorithms can be applied simultaneously to different subsets of data. This approach allows data scientists to explore various ML models in parallel, quickly identifying which algorithms yield the best results for a given dataset.

Hyperparameter tuning, an essential step in optimizing ML models, can also benefit greatly from the parallel processing capabilities of fleets. By running hyperparameter tuning in fleets, multiple combinations of parameters can be tested simultaneously. This not only speeds up the process but also helps in discovering the most effective configuration to enhance model performance across varied datasets.

<figure><img src="https://1018070783-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIQKNeqjEeOp9UwUcB9R9%2Fuploads%2FVf7d8dOHYKd9hpVBltDH%2Fimage.png?alt=media&#x26;token=2f972b2b-f0ca-40fc-a18f-6126dce9bdc5" alt=""><figcaption><p>Setup XGBoost Parameters</p></figcaption></figure>

<figure><img src="https://1018070783-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIQKNeqjEeOp9UwUcB9R9%2Fuploads%2Fad5LR5MxfyXV4ReZ7InT%2Fimage.png?alt=media&#x26;token=03ea0fbb-7177-498f-95af-8c60826e5038" alt=""><figcaption><p>Pass Parameters to Spread Function</p></figcaption></figure>

<figure><img src="https://1018070783-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIQKNeqjEeOp9UwUcB9R9%2Fuploads%2FMDsYx7ECO6URcrlTWBvJ%2Fimage.png?alt=media&#x26;token=a81cf470-b68b-47dc-b6f7-0e7bbe7dc411" alt=""><figcaption><p>72 Concurrent Runs Initialized - One for Each Hyperparameter Combination</p></figcaption></figure>

<figure><img src="https://1018070783-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIQKNeqjEeOp9UwUcB9R9%2Fuploads%2FTXQm3GpkPobpcJtmIshH%2Fimage.png?alt=media&#x26;token=be20c260-3e96-4750-a078-2f5ba92073c7" alt=""><figcaption><p>Use GATHER function to combine all results in Aggregator Block</p></figcaption></figure>

<figure><img src="https://1018070783-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIQKNeqjEeOp9UwUcB9R9%2Fuploads%2FGykdQhvZKH6p5Yz0oI7Z%2Fimage.png?alt=media&#x26;token=b0fe768c-e985-4239-9b09-5df4662deb9d" alt=""><figcaption><p>Retrain with Best Parameter Combination</p></figcaption></figure>
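The hyperparameter sweep shown above can be sketched outside Zerve as a parameter grid fanned out in parallel. In a fleet, the grid list would be passed to `spread()`, each combination would train in its own compute block, and the aggregator block would pick the winner from `gather()`. The actual grid values from the screenshots aren't shown, so the ones below are illustrative (chosen so the count also comes to 72), and the scoring function is a toy stand-in for real training:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Build every hyperparameter combination: 4 * 6 * 3 = 72 runs
max_depths = [3, 5, 7, 9]
learning_rates = [0.01, 0.05, 0.1, 0.2, 0.3, 0.5]
n_estimators = [100, 200, 300]
grid = [
    {"max_depth": d, "learning_rate": lr, "n_estimators": n}
    for d, lr, n in product(max_depths, learning_rates, n_estimators)
]

def evaluate(params):
    # Toy stand-in for training an XGBoost model and returning a
    # validation score; in a fleet this body lives in the spread block
    return params, params["max_depth"] * params["learning_rate"]

# Fan out one evaluation per combination, then collect (params, score) pairs
with ThreadPoolExecutor() as pool:
    results = list(pool.map(evaluate, grid))

# Aggregator step: pick the best combination, then retrain with it
best_params, best_score = max(results, key=lambda r: r[1])
```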

### Data Processing with Fleets:

Fleets in Zerve also enable faster data processing by allowing category-level processing within a dataset column. For instance, when dealing with large datasets that contain categorical variables, fleets can process each category in parallel. This strategy reduces processing time substantially, enabling quicker insights and data-driven decisions.

<figure><img src="https://1018070783-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIQKNeqjEeOp9UwUcB9R9%2Fuploads%2FgtbWhW40p3stKewlIygs%2Fimage.png?alt=media&#x26;token=d588ba7e-5aba-45d8-ac1d-d263340b7c61" alt=""><figcaption><p>Setup Airline list to Query Snowflake</p></figcaption></figure>

<figure><img src="https://1018070783-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIQKNeqjEeOp9UwUcB9R9%2Fuploads%2FO02yS3b4NwwYVOaM9NOm%2Fimage.png?alt=media&#x26;token=1a0d26bc-54db-4664-a23e-6ce5dc11c9c4" alt=""><figcaption><p>Query/Process Snowflake code from 18 concurrent blocks</p></figcaption></figure>
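In the airline example above, every fleet block runs the same query filtered to one category value. The pattern can be sketched in plain Python, with a toy in-memory dataset standing in for the Snowflake query (all names and values here are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

# Toy rows standing in for a Snowflake flights table
rows = [
    {"airline": "AA", "delay": 12},
    {"airline": "DL", "delay": 5},
    {"airline": "AA", "delay": 30},
    {"airline": "UA", "delay": 8},
]

# The distinct category values - this is the list a fleet would spread() over
airlines = sorted({r["airline"] for r in rows})

def process_airline(code):
    # Stand-in for "SELECT ... WHERE airline = <code>" plus
    # per-category processing (here: mean delay)
    delays = [r["delay"] for r in rows if r["airline"] == code]
    return code, sum(delays) / len(delays)

# One parallel task per category; gather results into a dict
with ThreadPoolExecutor() as pool:
    results = dict(pool.map(process_airline, airlines))

print(results)  # {'AA': 21.0, 'DL': 5.0, 'UA': 8.0}
```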

### LLM Evaluation with Fleets:

Testing large language models (LLMs) with multiple prompts across different models becomes significantly more efficient with fleets. By distributing the workload across different compute blocks, teams can evaluate LLMs against a variety of inputs in a fraction of the time sequential testing would take. This parallel approach not only saves time but also provides a more comprehensive understanding of how different models perform under diverse conditions.

<figure><img src="https://1018070783-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIQKNeqjEeOp9UwUcB9R9%2Fuploads%2F61Vq45frfKBEXK8iWV45%2Fimage.png?alt=media&#x26;token=dd4c8949-76f9-426b-b01c-964973e477ac" alt=""><figcaption><p>Create a list of Product Rating Descriptions</p></figcaption></figure>

<figure><img src="https://1018070783-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIQKNeqjEeOp9UwUcB9R9%2Fuploads%2FLfe6Px1hPLUJsVIGG0dY%2Fimage.png?alt=media&#x26;token=d0618e96-ff82-4fb3-b2c1-4f53a16f7404" alt=""><figcaption><p>Pass it to 100 concurrent LLM requests for processing</p></figcaption></figure>
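The fan-out above can be sketched in plain Python with a stubbed model call. In a fleet, the list of (model, prompt) jobs would be passed to `spread()` and each request would run in its own block; the `call_llm` stub below stands in for a real API call, and all names are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Prompts and models to evaluate; the combined job list is what spread() would receive
prompts = ["Rate this review: 'Great product'", "Rate this review: 'Broke in a week'"]
models = ["model-a", "model-b"]
jobs = list(product(models, prompts))

def call_llm(job):
    model, prompt = job
    # Stand-in for a real LLM API request; returns a canned response here
    return {"model": model, "prompt": prompt, "response": f"[{model}] scored"}

# One concurrent request per (model, prompt) pair, gathered into a list
with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
    responses = list(pool.map(call_llm, jobs))
```

The aggregator block would then compare `responses` across models and prompts side by side.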
