adapters

Adapters are the way you can get data in and out of the flume pipeline. There are a set of built in ones that you can read about in the following sections and you can also register your own adapter which we'll dive into how the current API for that works as well below.

built-in

write your own

Writing your own adapter is encouraged as there's no way we'd ever have an adapter for every possible storage at this point in the development of flume. We also look forward to accepting any adapters written into the core of flume as to better share with the community.

To write your own adapter you need to extend from the class flume.adapters.adapter.adapter and then implement the methods:

Now that you've written your adapter you can register it using the register_streamer(cls) which will register your adapter with the name specified by the class level attribute name.

optimizations

Optimizations are the way that flume takes a pipeline of operations and figures out which ones can be executed on the backend (closer to where the data resides) so that you don't have to transport data from its source to where the flume pipeline is executing in order to compute something on that data or make a decision about it. With that said the optimize method on the adapter allows you to see what is downstream from the adapter before the adapter starts running.

As an example the simplest thing to optimize in an adapter is when the following proc is the head proc then you can simply set a limit internally in your adapter and stop reading data once you've reached the desired first N values. This may not seem like much but can save a lot of time if you were to only need the top 100 values but had 1000 of them to read from a backend and would simply stop much earlier pulling irrelevant points from the backend.

Have a look at how we implemented the optimize method for elastic and stdio to get a simple view of how optimization can be done.