Computing a Rolling Median¶

This describes how to code up a rolling median using a SkipList in both C++ and Python. A rolling median operation means, for a SkipList, an insert(new_value) then at(middle) then remove(old_value). The number of operations to calculate a rolling median is the data length minus the window length.

The same approach can be used for a rolling percentile.
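The insert, at, remove cycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a sorted list maintained with the standard library `bisect` module stands in for the SkipList (so insert and remove are O(n) rather than O(log n)), and the name `rolling_median_sketch` is hypothetical. It emits a value as soon as the window is full.

```python
import bisect
import typing


def rolling_median_sketch(data: typing.Sequence[float],
                          window_length: int) -> typing.List[float]:
    """Rolling median using a sorted list as a stand-in for a SkipList."""
    if window_length < 1:
        raise ValueError('window_length must be >= 1')
    window: typing.List[float] = []  # Always kept in sorted order.
    result: typing.List[float] = []
    for i, value in enumerate(data):
        bisect.insort(window, value)  # insert(new_value)
        if i + 1 >= window_length:
            result.append(window[window_length // 2])  # at(middle)
            oldest = data[i + 1 - window_length]
            del window[bisect.bisect_left(window, oldest)]  # remove(old_value)
    return result


print(rolling_median_sketch([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], 3))
# [1.0, 2.0, 3.0, 4.0]
```

Changing the index passed to the lookup (here `window_length // 2`) is all that is needed to get a different percentile.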

Rolling Median in C++¶

Here is a reasonable C++ attempt at doing that with the arguments:

data - A vector of data of type T of length L.
win_length - a ‘window’ size. The median is computed over this number of values.
result - a destination vector for the result. This will end up with L - win_length values.

#include <cassert>
#include <vector>

#include "SkipList.h"

template<typename T>
RollingMedianResult rolling_median(const std::vector<T> &data,
                                   size_t win_length,
                                   std::vector<T> &result) {
    if (win_length == 0) {
        return ROLLING_MEDIAN_WIN_LENGTH;
    }
    OrderedStructs::SkipList::HeadNode<T> sl;

    result.clear();
    std::vector<T> buffer;
    for (size_t i = 0; i < data.size(); ++i) {
        sl.insert(data[i]);
        if (i >= win_length) {
            if (win_length % 2 == 1) {
                result.push_back(sl.at(win_length / 2));
            } else {
                /* Even length so average */
                sl.at((win_length - 1) / 2, 2, buffer);
                assert(buffer.size() == 2);
                result.push_back(buffer[0] / 2 + buffer[1] / 2);
            }
            sl.remove(data[i - win_length]);
        }
    }
    return ROLLING_MEDIAN_SUCCESS;
}

A full example is the RollingMedian::rolling_median_lower_bound function in RollingMedian.h.

If you are working with C arrays (such as Numpy arrays) then this C style approach might be better. Error checking is omitted:

#include <cstddef>

#include "SkipList.h"

template <typename T>
void rolling_median(const T *src, size_t count, size_t win_length, T *dest) {

    OrderedStructs::SkipList::HeadNode<T> sl;
    const T *tail = src;

    for (size_t i = 0; i < count; ++i) {
        sl.insert(*src);
        if (i + 1 >= win_length) {
            *dest = sl.at(win_length / 2);
            ++dest;
            sl.remove(*tail);
            ++tail;
        }
        ++src;
    }
}

Multidimensional Numpy arrays have a stride value which is omitted in the above code but is simple to add. See RollingMedian.h and test/test_rolling_median.cpp for further examples.

Rolling percentiles require an argument that says at what fraction of the window the required value lies. Again, this is easy to add.
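As a rough sketch of both additions, here is a Python version that takes a stride and a fraction. It is not the code in RollingMedian.h: the function name and parameters are hypothetical, a sorted list stands in for the SkipList, and the stride is counted in elements (Numpy reports strides in bytes).

```python
import bisect
import typing


def rolling_percentile_strided(buffer: typing.Sequence[float],
                               start: int, stride: int, count: int,
                               window_length: int,
                               fraction: float = 0.5) -> typing.List[float]:
    """Rolling percentile over buffer[start], buffer[start + stride], ...

    fraction=0.5 gives the median, 0.25 the lower quartile and so on.
    """
    if window_length < 1 or not 0.0 <= fraction < 1.0:
        raise ValueError('bad window_length or fraction')
    index = int(window_length * fraction)  # Position within the sorted window.
    window: typing.List[float] = []
    result: typing.List[float] = []
    for i in range(count):
        bisect.insort(window, buffer[start + i * stride])
        if i + 1 >= window_length:
            result.append(window[index])
            oldest = buffer[start + (i + 1 - window_length) * stride]
            del window[bisect.bisect_left(window, oldest)]
    return result


# A 4 x 2 table stored row by row; column 1 is every second value from index 1.
table = [0.0, 10.0,
         1.0, 30.0,
         2.0, 20.0,
         3.0, 40.0]
print(rolling_percentile_strided(table, start=1, stride=2, count=4,
                                 window_length=3))
# [20.0, 30.0]
```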

Even Window Length¶

The C array code above takes the value at index win_length / 2 whatever the window length. For even sized window lengths a more plausible median is the mean of the values at indices (window length - 1) / 2 and window length / 2, which is what the std::vector code above does.

This requires that the mean of two values of the type is meaningful, which it will not be for strings. In that case you will get this compilation error:

RollingMedian.h:91:52: error: invalid operands to binary expression ('value_type' (aka 'std::string') and 'int')
            result.push_back(buffer[0] / 2 + buffer[1] / 2);
                             ~~~~~~~~~ ^ ~

One remedy for this is to use the RollingMedian::rolling_median_lower_bound function. This always uses the lower bound so it works correctly for odd sized window lengths. For even sized window lengths it chooses the lower of the two middle values rather than averaging them. This is useful for types, such as strings, that cannot be averaged.
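The difference between the two choices can be illustrated on an already sorted window. This is a sketch in Python rather than the C++ in RollingMedian.h, and both function names are hypothetical.

```python
import typing

T = typing.TypeVar('T')


def median_lower_bound(sorted_window: typing.Sequence[T]) -> T:
    """The value at index (n - 1) // 2; needs only ordering, not arithmetic."""
    return sorted_window[(len(sorted_window) - 1) // 2]


def median_mean(sorted_window: typing.Sequence[float]) -> float:
    """The mean of the two middle values if n is even, else the middle value."""
    n = len(sorted_window)
    if n % 2 == 1:
        return sorted_window[n // 2]
    return sorted_window[(n - 1) // 2] / 2 + sorted_window[n // 2] / 2


print(median_mean([1.0, 2.0, 4.0, 8.0]))         # 3.0
print(median_lower_bound([1.0, 2.0, 4.0, 8.0]))  # 2.0
print(median_lower_bound(['a', 'b', 'c', 'd']))  # b
```

Calling `median_mean` on the list of strings fails with a `TypeError` at run time, the Python counterpart of the compilation error above.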

C++ Performance¶

Here is a plot of the time taken to compute a rolling median on one million values using different window sizes. The time is in ns/result and the number of results is 1e6 - window size. For example, with a data length of 1m, a window length of 1,000 means 999,000 operations and a window length of 500,000 means 500,000 operations.

A window size of 1,000 (the size of the SkipList) and 1m values takes around 750 ns/value or 0.75 seconds in total.

The test function is perf_roll_med_by_win_size() in src/cpp/test/test_performance.cpp.

Rolling Median in Python¶

Here is an example of computing a rolling median of a numpy 1D array. This creates an array with the same length as the input, starting with window_length - 1 NaNs:

import numpy as np

import orderedstructs

def simple_python_rolling_median(vector: np.ndarray,
                                 window_length: int) -> np.ndarray:
    """Computes a rolling median of a numpy vector returning a new numpy
    vector of the same length.
    NaNs in the input are not handled but a ValueError will be raised."""
    if vector.ndim != 1:
        raise ValueError(
            f'vector must be one dimensional not shape {vector.shape}'
        )
    skip_list = orderedstructs.SkipList(float)
    ret = np.empty_like(vector)
    for i in range(len(vector)):
        value = vector[i]
        skip_list.insert(value)
        if i >= window_length - 1:
            # // 4 for lower quartile
            # * 3 // 4 for upper quartile etc.
            median = skip_list.at(window_length // 2)
            skip_list.remove(vector[i - window_length + 1])
        else:
            median = np.nan
        ret[i] = median
    return ret

This can be called thus:

np_array = np.arange(10.0)
print('Original:', np_array)
result = simple_python_rolling_median(np_array, 3)
print('  Result:', result)

And the result will be:

Original: [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
  Result: [nan nan  1.  2.  3.  4.  5.  6.  7.  8.]

Of course this Python code could be made much faster by using a Python C Extension.

Python Performance¶

Here is a plot of the time taken to compute a rolling median on one million values using different window sizes. The time is in ns/result and the number of results is 1e6 - window size. For example, with a data length of 1m, a window length of 1,000 means 999,000 operations and a window length of 500,000 means 500,000 operations.

A window size of 1,000 (the size of the SkipList) and 1m values takes around 750 ns/value or 0.75 seconds in total.

Python Rolling Median by Window Size Performance

Performance Comparison of C++ and Python¶

Here is the C++ performance plotted along with the Python performance.

C++ and Python Rolling Median by Window Size Performance

As expected C++ is around 2x faster.

But Python has another trick up its sleeve that can make it outperform C++ decisively: multiprocessing with shared memory.

Rolling Median in Python with `multiprocessing.shared_memory`¶

An exciting development in Python 3.8+ is multiprocessing.shared_memory. This allows a parent process to share memory with its child processes.

In this example we are going to compute a rolling median on a 2D numpy array where each child process works on a single column of the same array and writes the result to a shared output array. There will be two shared memory areas: a read one with the input data and a write one with the results from all the child processes. There will be two corresponding numpy arrays: the input that we are given and the output numpy array that we create.

The only copying going on here is the initial copy of the input array into shared memory and then the final copy, when all child processes have completed, of the output shared memory to a single numpy array.

Pictorially:

Parent                                        Children
======                                        ========
Copies the numpy array to the input SharedMemory
Creates the output SharedMemory
Launches n child processes...
\>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\
                                              Work on part of the input SharedMemory
                                              Write to the output SharedMemory
                                              ...
/<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<</
When all child processes complete...
Copies output SharedMemory to a new numpy array
Releases both SharedMemory resources.

Note

This solution assumes that you are given a numpy array that you need to process. An alternative solution is to create a shared memory object, create an empty numpy array that uses the shared memory buffer, populate that buffer and pass the buffer to the child processes. This would save the cost, in time and memory, of the first copy operation.
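A minimal sketch of that alternative is below. To keep it free of third party packages it casts the shared buffer to a memoryview of doubles rather than wrapping it in a numpy array; with numpy the equivalent step would be the np.ndarray(shape, dtype=..., buffer=shm.buf) call used elsewhere on this page. The function name is hypothetical and, for brevity, the attach by name that a child process would do happens in the same process.

```python
from multiprocessing import shared_memory

ITEM_SIZE = 8  # Bytes in a C double, typecode 'd'.


def sum_of_generated_doubles(count: int) -> float:
    """Creates shared memory, fills it in place and reads it back by name."""
    nbytes = count * ITEM_SIZE
    shm = shared_memory.SharedMemory(create=True, size=nbytes)
    try:
        # The block can be larger than requested so slice it before casting.
        with shm.buf[:nbytes] as raw, raw.cast('d') as values:
            for i in range(count):
                values[i] = float(i)  # Populate in place, no source array.
        # A child process would attach to the same block by name like this.
        attached = shared_memory.SharedMemory(name=shm.name)
        try:
            with attached.buf[:nbytes] as raw, raw.cast('d') as values:
                return sum(values)
        finally:
            attached.close()
    finally:
        # The views above are released so the block can be closed safely.
        shm.close()
        shm.unlink()


print(sum_of_generated_doubles(5))
# 10.0
```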

Code¶

The exemplar code is in tests/benchmarks/roll_med_sh_mem.py

Note

The reason for the odd code location is the way that the orderedstructs package is, for historical reasons, constructed entirely in C. See PyInit_orderedstructs() in src/cpy/cOrderedStructs.cpp.

This means that it is not possible to combine pure Python and C code in the orderedstructs package. Specifically orderedstructs does not have a __path__ attribute that allows it to import pure Python modules from that path.

In future this might change to be more like, say, pymemtrace that is able to mix C and pure Python code in the same package.

These are the essential imports and a utility function:

import contextlib
import dataclasses
import multiprocessing
# Python 3.8+, need to be specific about importing this.
from multiprocessing import shared_memory
import typing

import numpy as np

import orderedstructs

def np_data_pointer(a: np.ndarray) -> int:
    """The address of the actual data. 'data pointer' in np.info()."""
    return a.__array_interface__["data"][0]

Firstly here is the rolling median function that will be used by the child processes. It works on a specific column of a numpy array. Error handling and logging are omitted for clarity:

def rolling_median_of_column(read_array: np.ndarray,
                             window_length: int, column_index: int,
                             write_array: np.ndarray) -> int:
    """Computes a rolling median of given column and writes out the
    results to the write array.

    This is called by a child process.

    This fills the initial column values with NaN where there is
    not enough data for a rolling median of window_length.

    Returns the number of values written to the write array.
    """
    skip_list = orderedstructs.SkipList(float)
    write_count = 0
    for i in range(len(read_array)):
        value = read_array[i, column_index]
        skip_list.insert(value)
        if i >= window_length:
            median = skip_list.at(window_length // 2)
            skip_list.remove(read_array[i - window_length, column_index])
        else:
            median = np.nan
        write_array[i, column_index] = median
        write_count += 1
    return write_count

Now some code that wraps the low level multiprocessing.shared_memory.SharedMemory class so that it can be used by the parent process. We will call it a SharedMemoryArraySpecification. It is pretty simple, just a dataclass that records essential information about the array and includes the SharedMemory object itself (some unimportant functions are omitted here):

from multiprocessing import shared_memory

@dataclasses.dataclass
class SharedMemoryArraySpecification:
    """A PoD class that contains the data needed for
    managing a shared_memory.SharedMemory.

    Typical usage, given ``arr: np.ndarray``::

        from multiprocessing import shared_memory

        shm = shared_memory.SharedMemory(create=True, size=arr.nbytes)
        array_spec = SharedMemoryArraySpecification(
            arr.shape, arr.dtype, arr.nbytes, shm,
        )
    """
    shape: typing.Tuple[int, ...]
    dtype: np.dtype
    nbytes: int
    shm: shared_memory.SharedMemory

    @property
    def name(self) -> str:
        return self.shm.name

    def close(self) -> None:
        """Close the file descriptor/handle to the shared memory
        from this instance."""
        self.shm.close()

    def close_and_unlink(self) -> None:
        """Delete the underlying shared memory block.
        This should be called only once per shared memory block
        regardless of the number of handles to it, even in other
        processes."""
        self.close()
        self.shm.unlink()

Now a function that takes a numpy array and uses shared_memory.SharedMemory to return a SharedMemoryArraySpecification. Optionally this copies the numpy array into the shared memory for the read array. We write this as a context manager. Error handling and logging are omitted for clarity:

@contextlib.contextmanager
def create_shared_memory_array_spec_close_unlink(
        arr: np.ndarray,
        copy_array: bool,
) -> typing.Iterator[SharedMemoryArraySpecification]:
    """Context manager that creates a Shared Memory instance and,
    optionally, copies the numpy array into it.
    The Shared Memory instance is closed and unlinked on exit."""
    shm = shared_memory.SharedMemory(create=True, size=arr.nbytes)
    array_spec = SharedMemoryArraySpecification(
        arr.shape, arr.dtype, arr.nbytes, shm
    )
    try:
        if copy_array:
            # Copy the numpy array into shared memory.
            array_view = np.ndarray(
                array_spec.shape,
                dtype=array_spec.dtype, buffer=array_spec.shm.buf
            )
            array_view[:] = arr[:]
        yield array_spec
    finally:
        array_spec.close_and_unlink()

Next is a context manager that will wrap a numpy array around a SharedMemoryArraySpecification. On exit this automatically releases the reference to the shared memory from the child process (but does not unlink that memory):

@contextlib.contextmanager
def recover_array_from_shared_memory_and_close(
        array_spec: SharedMemoryArraySpecification,
) -> typing.Iterator[np.ndarray]:
    """A context manager as a wrapper around a SharedMemoryArraySpecification.
    This yields a view on the numpy input or output array and ensures that the
    shared memory is closed on __exit__."""
    array_shm = shared_memory.SharedMemory(name=array_spec.name)
    array_view = np.ndarray(
        array_spec.shape, array_spec.dtype, buffer=array_shm.buf
    )
    try:
        yield array_view
    finally:
        array_shm.close()

And use it in the child process, once for reading and once for writing. Error handling and logging are omitted for clarity:

def compute_rolling_median_2d_from_index(
        read_spec: SharedMemoryArraySpecification,
        window_length: int, column_index: int,
        write_spec: SharedMemoryArraySpecification,
) -> int:
    """Computes a rolling median of the 2D read array and
    window length and writes it to the 2D write array.

    This function is passed to multiprocessing to be invoked
    by the child process.
    """
    with recover_array_from_shared_memory_and_close(read_spec) as read_arr:
        with recover_array_from_shared_memory_and_close(write_spec) as write_arr:
            write_count = rolling_median_of_column(
                read_arr, window_length, column_index, write_arr
            )
    return write_count

Next is a function to copy the output shared memory to a new numpy array:

def copy_shared_memory_into_new_numpy_array(
        write_array_spec: SharedMemoryArraySpecification,
) -> np.ndarray:
    """With the output SharedMemoryArraySpecification
    this creates a new numpy array and copies the shared memory into it."""
    temp_write = np.ndarray(
        write_array_spec.shape,
        dtype=write_array_spec.dtype,
        buffer=write_array_spec.shm.buf
    )
    write_array = np.empty(
        write_array_spec.shape,
        dtype=write_array_spec.dtype,
    )
    write_array[:] = temp_write[:]
    return write_array

Finally here is the code for the parent process that puts this all together. Error handling and logging are omitted for clarity:

def compute_rolling_median_2d_mp(
        read_array: np.ndarray,
        window_length: int, num_processes: int,
) -> np.ndarray:
    """Compute a rolling median of a numpy 2D array using
    multiprocessing and shared memory.
    This is the top level call and is run within the parent process.

    This returns a np.ndarray of the same shape as the input with the
    rolling medians.
    Rows of this up to the window_length will be NaN.
    """
    # Limit number of processes if the number of columns is small.
    if read_array.shape[1] < num_processes:
        num_processes = read_array.shape[1]
    with create_shared_memory_array_spec_close_unlink(
            read_array, copy_array=True,
    ) as read_array_spec:
        with create_shared_memory_array_spec_close_unlink(
                read_array, copy_array=False,
        ) as write_array_spec:
            # Create the tasks.
            tasks = []
            for column_index in range(read_array.shape[1]):
                tasks.append(
                    (read_array_spec, window_length, column_index, write_array_spec)
                )
            # Create the pool and apply the tasks.
            # The with statement makes sure that the pool is cleaned up.
            with multiprocessing.Pool(processes=num_processes) as mp_pool:
                pool_apply = [
                    mp_pool.apply_async(
                        compute_rolling_median_2d_from_index, t
                    ) for t in tasks
                ]
                _write_counts = [r.get() for r in pool_apply]
            write_array = copy_shared_memory_into_new_numpy_array(write_array_spec)
    return write_array

This is the function that we are going to time so it includes:

Copying the numpy array to shared memory.
Creating the output shared memory.
Computing the rolling median with child processes.
Copying the output shared memory to a new numpy array.
Disposing of any temporaries.

Performance¶

Low Level Shared Memory Performance¶

First some low level performance benchmarks for multiprocessing.shared_memory. Here is the cost of creating a SharedMemory object with (for reading) and without (for writing) copying from a numpy array.

The non-copy cost is negligible. The copy cost is around 1,600 MB/s for 64bit floats.

And here is the cost of creating a new numpy array from the SharedMemory object:

The copy cost is around 1,600 MB/s for 64bit floats.

Performance on a Table of Floats¶

The table has 134,217,728 floats (1GB of data) and the tests are run with different shapes. The rolling median window is 21. The platform was Mac OS X with 4 cores and hyper-threading.

Here is the total time to compute the rolling median with this amount of data for different shapes and different numbers of processes.

The test code is in tests/benchmarks/test_benchmark_SkipList_rolling_median_sh_mem.py and is normally skipped as it can take up to six hours per Python version.

For reference 100 seconds is 745 nanoseconds per operation.
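That conversion, and the sizes of the tables used in these tests, can be checked with a little arithmetic. The shapes below are the ones used in the per-column sections of this page; the function name is hypothetical.

```python
ENTRIES = 134_217_728  # Number of floats in the table.
BYTES_PER_FLOAT = 8

# Each (rows, columns) shape used in the tests holds the same number of entries.
SHAPES = [(8_388_608, 16), (131_072, 1024), (2048, 65_536)]
for rows, columns in SHAPES:
    assert rows * columns == ENTRIES
print(ENTRIES * BYTES_PER_FLOAT)  # 1073741824


def ns_per_operation(total_seconds: float, operations: int = ENTRIES) -> float:
    """Converts a total run time to nanoseconds per table entry."""
    return total_seconds * 1e9 / operations


print(f'{ns_per_operation(100.0):.0f}')  # 745
```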

Here is a comparison of different shapes and different numbers of processes relative to a single process.

Rolling Median with Shared Memory Relative Performance by Number of Processes

That is for the full 1GB array, but how much data is needed for shared memory to be an effective technique?

The following results are from the code in tests/unit/_test_rolling_median_shared_memory.py.

Columns: 16¶

In this test a 16 column array is created with up to 8,388,608 rows. This is up to 134,217,728 entries which, at 8 bytes a float, is 1,073,741,824 bytes (1GB) in total. Running this on 16 column arrays with 1m rows with processes from 1 to 16 gives the following execution times.

Comparing the speed of execution compared to a single process gives:

Rolling Median Relative Performance, 16 Columns

Clearly there is some overhead so it is not really worth doing this for fewer than 100,000 rows. A number of processes equal to the number of CPUs is optimal; twice that might give a small advantage.

Columns: 1024¶

In this test a 1024 column array is created with up to 131,072 rows. This is up to 134,217,728 entries which, at 8 bytes a float, is 1,073,741,824 bytes (1GB) in total. Running this on 1024 column arrays with up to 131,072 rows with processes from 1 to 16 gives the following execution times.

Rolling Median Performance, 1024 Columns

Comparing the speed of execution compared to a single process gives:

Rolling Median Relative Performance, 1024 Columns

Clearly there is some overhead so it is not really worth doing this for fewer than 1,000 rows or so.

Columns: 65536¶

In this test a 65,536 column array is created with up to 2048 rows. This is up to 134,217,728 entries which, at 8 bytes a float, is 1,073,741,824 bytes (1GB) in total. Running this on 65,536 column arrays with up to 2048 rows with processes from 1 to 16 gives the following execution times.

Rolling Median Performance, 65536 Columns

Comparing the speed of execution compared to a single process gives:

Rolling Median Relative Performance, 65536 Columns

The overhead by number of columns is very low.

Comparison By Shape¶

Here are the results of the time to compute a rolling median with a single process for different numbers of columns and different array shapes by array size (number of rows):

Here are the results of the time to compute a rolling median with four processes for different numbers of columns and different array shapes by array size (number of rows):

Here is all the data plotted together for comparison:

The rate is, perhaps, more revealing:

Summary¶

Here is a summary of the performance gain for different table shapes using four simultaneous processes on a four CPU machine. The second column shows the number of rows needed to get a 3x speedup over a single process. The third column (“Best”) shows the maximum speedup.

Performance Gain¶
Columns	~Rows for 3x	Best
16	800,000	3.3x
1024	8,000	3.3x
65536	128	3.0x

The relative performance improvement between a single process and four processes is very consistent:

So, in summary, setting up shared memory really comes into its own with data sets of 1m+ values. It is not advised for fewer than 100,000 data points.

Here is a typical comparison between C++ and Python working on large data sets on a four CPU machine:

Python and C++¶
Environment	Time/op (ns)	Notes
C++, single process	500
Python, single process	900
Python, shared memory, processes=1	1000
Python, shared memory, processes=4	350

So Python can beat C++!
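The size of that win can be read off the table with a little arithmetic. This sketch just divides each time per operation by the four process figure; the dictionary holds the values from the table above and the function name is hypothetical.

```python
import typing

TIMES_NS = {
    'C++, single process': 500,
    'Python, single process': 900,
    'Python, shared memory, processes=1': 1000,
    'Python, shared memory, processes=4': 350,
}


def speedup_of(reference: str,
               times_ns: typing.Dict[str, float]) -> typing.Dict[str, float]:
    """How many times faster the reference environment is than each entry."""
    return {name: t / times_ns[reference] for name, t in times_ns.items()}


ratios = speedup_of('Python, shared memory, processes=4', TIMES_NS)
for name, ratio in ratios.items():
    print(f'{name}: {ratio:.2f}x')
# C++, single process: 1.43x
# Python, single process: 2.57x
# Python, shared memory, processes=1: 2.86x
# Python, shared memory, processes=4: 1.00x
```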

Memory Usage¶

This is what I would expect when processing a 100Mb numpy array. Values are in Mb.

Expected `shared_memory` Memory Usage With 100 Mb numpy array¶
Action	Memory Delta (Mb)	Total Memory (Mb)
Create a ‘read’ numpy array.	+100	100
Create a ‘read’ shared memory object.	+100	200
Copy the ‘read’ array into ‘read’ shared memory.	0 or very small.	200
Create a ‘write’ shared memory object.	+100	300
Calculate the rolling median and write into the ‘write’ shared memory object.	0 or very small.	300
Create an empty ‘write’ numpy array.	+100	400
Copy the ‘write’ shared memory into the ‘write’ numpy array.	0 or very small.	400
Unlink the ‘read’ shared memory	-100	300
Unlink the ‘write’ shared memory.	-100	200
Delete the ‘read’ numpy array when de-referenced.	-100	100
Delete the ‘write’ numpy array when de-referenced.	-100	Nominal.

Here are the actual results running on Mac OS X. Two things are noticeable:

Creating the shared memory object has no memory cost. Memory is only allocated when data is copied into it, and that allocation is incremental.
The RSS shown here is collected from psutil and appears to include shared memory, so there may be double counting. psutil cannot identify shared memory on Mac OS X, although it can on Linux.

Here is the breakdown of the RSS memory profile of processing a numpy array with 6m rows with 2 columns (100Mb) with a parent [P] and two child processes [0] and [1]. The change in RSS is indicated by “d” (if non-zero). Values are in Mb.

`shared_memory` Memory Usage With Two Processes¶
Action	P	dP	0	d0	1	d1	Notes
Parent start	30	+30					Normal Python executable.
Create numpy array	130	+100					Cost of creating a 100Mb numpy array.
Create read shared memory	130						No immediate memory cost.
Copy numpy array into shared memory	225	+95
Create write shared memory	225						No immediate memory cost.
Child start			23	+23	23	+23	Normal Python executable.
Rolling median start			23		23
Rolling median 25%			71	+48	71	+48	Incremental memory increase, similar to copy on write.
Rolling median 50%			119	+48	119	+48
Rolling median 75%			166	+47	166	+47
Rolling median complete			214	+48	214	+48	Peak figure, it looks like the RSS for the child processes is including both shared memory areas (192Mb).
Close write shared memory			119	-95	119	-95
Close read shared memory			23	-96	23	-96	Child process now using the normal memory for a Python process.
After child processes complete.	227	+2					Children have written to write shared memory which is now included in the parent memory RSS.
After creating empty numpy write array.	227						NOTE: Buffer is lazily allocated.
After writing write shared memory to numpy write array.	419	+192					Not sure why this is twice what is expected (100Mb).
After unlink write array spec.	321	-98
After unlink read array spec.	226	-95					Discard read array shared memory. Numpy read and write arrays still exist, 100Mb each.
Parent process ends.	227						Read array and write array discarded. See note below.

Note

There is an interesting quirk here: the array is 6m rows with 2 columns and there is a residual memory of 227Mb. This is not reduced by a gc.collect() and it does not increase if the same function calls are repeated. If the array is changed to 16m rows, 2 columns (260Mb) the residual memory is 35Mb, typical for a minimal Python process.

Handling NaNs¶

Not-a-number (NaN) values cannot be inserted into a Skip List as they are not comparable to anything (including themselves). An attempt to call insert(), index(), has() or remove() with a NaN will raise an error. In C++ this will throw an OrderedStructs::SkipList::FailedComparison. In Python it will raise a ValueError. This section looks at handling NaNs in Python.
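Why NaN has to be rejected can be seen with plain Python floats and a sorted list. This is an illustration only, not how the SkipList is implemented; `checked_insert` is a hypothetical guard that mimics raising a ValueError.

```python
import bisect
import math
import typing

# NaN fails every ordering and equality comparison, even against itself.
print(math.nan == math.nan, math.nan < 1.0, math.nan > 1.0)
# False False False

window = [1.0, 2.0, 3.0]
bisect.insort(window, math.nan)  # Lands at the end: nan < x is never true.
# Searching for it looks at the wrong end: x < nan is never true either.
nan_index = bisect.bisect_left(window, math.nan)
print(nan_index, window[nan_index])
# 0 1.0


def checked_insert(ordered: typing.List[float], value: float) -> None:
    """Hypothetical guard that rejects NaN before it can corrupt the order."""
    if math.isnan(value):
        raise ValueError('Can not insert NaN into an ordered container.')
    bisect.insort(ordered, value)
```

Once a NaN is in an ordered container it cannot be found again by comparison, so it could never be removed when it leaves the window.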

Here are several ways of handling NaNs:

Propagate the Exception.
Make the Median NaN.
Forward Filling.

Propagate the Exception¶

Here is a rolling median that will raise a ValueError if there is a NaN in the input.

import math
import typing

import orderedstructs

def rolling_median_no_nan(vector: typing.List[float],
                          window_length: int) -> typing.List[float]:
    """Computes a rolling median of a vector of floats and returns the results.
    NaNs will raise a ValueError."""
    skip_list = orderedstructs.SkipList(float)
    ret: typing.List[float] = []
    for i in range(len(vector)):
        value = vector[i]
        skip_list.insert(float(value)) # This will raise a ValueError on NaN
        if i >= window_length:
            median = skip_list.at(window_length // 2)
            skip_list.remove(float(vector[i - window_length]))
        else:
            median = math.nan
        ret.append(median)
    return ret

Make the Median NaN¶

Here is a rolling median that will make the median NaN if there is a NaN in the input. Incidentally this is the approach that numpy takes.

def rolling_median_with_nan(vector: typing.List[float],
                            window_length: int) -> typing.List[float]:
    """Computes a rolling median of a vector of floats and returns the results.
    NaNs will be consumed."""
    skip_list = orderedstructs.SkipList(float)
    ret: typing.List[float] = []
    for i in range(len(vector)):
        value = vector[i]
        if math.isnan(value):
            median = math.nan
        else:
            skip_list.insert(float(value))
            if i >= window_length:
                median = skip_list.at(window_length // 2)
                remove_value = vector[i - window_length]
                if not math.isnan(remove_value):
                    skip_list.remove(remove_value)
            else:
                median = math.nan
        ret.append(median)
    return ret

The first row is the input, the second the output. Window length is 5:

[0.0,      math.nan,      2.0,      3.0,      4.0, 5.0, 6.0, math.nan, 8.0, 9.0],
[math.nan, math.nan, math.nan, math.nan, math.nan, 3.0, 4.0, math.nan, 4.0, 5.0],

Forward Filling¶

Another approach is to replace the NaN with the previous value. This is very popular in FinTech and is commonly known as Forward Filling. Here is an implementation:

def forward_fill(vector: typing.List[float]) -> None:
    """Forward fills NaNs in-place."""
    previous_value = math.nan
    for i in range(len(vector)):
        value = vector[i]
        if math.isnan(value):
            vector[i] = previous_value
        else:
            previous_value = value

def rolling_median_with_nan_forward_fill(vector: typing.List[float],
                                         window_length: int) -> typing.List[float]:
    """Computes a rolling median of a vector of floats and returns the results.
    NaNs will be forward filled."""
    forward_fill(vector)
    return rolling_median_no_nan(vector, window_length)

The first row is the input, the second the output. Window length is 5:

[0.0,      math.nan,      2.0,      3.0,      4.0, 5.0, 6.0, math.nan, 8.0, 9.0],
[math.nan, math.nan, math.nan, math.nan, math.nan, 2.0, 3.0,      4.0, 5.0, 6.0],

Another example where element [3] is now also a NaN. The first row is the input, the second the output. Window length is 5:

[0.0,      math.nan,      2.0, math.nan,      4.0, 5.0, 6.0, math.nan, 8.0, 9.0],
[math.nan, math.nan, math.nan, math.nan, math.nan, 2.0, 2.0,      4.0, 5.0, 6.0],

There is no ‘right way’ to handle NaNs. They are always problematic. For example what is the ‘right way’ to sort a sequence of values that may include NaNs?
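For sorting, one possible policy among several is to put every NaN last. A sketch, with a hypothetical function name, that never lets a NaN take part in a comparison:

```python
import math
import typing


def sort_nans_last(values: typing.Iterable[float]) -> typing.List[float]:
    """Sorts ascending with every NaN moved to the end."""
    def key(value: float) -> typing.Tuple[bool, float]:
        # NaN never appears in a comparison: it is replaced by 0.0 in the key.
        return (True, 0.0) if math.isnan(value) else (False, value)
    return sorted(values, key=key)


print(sort_nans_last([3.0, math.nan, 1.0, math.nan, 2.0]))
# [1.0, 2.0, 3.0, nan, nan]
```

Putting NaNs first, or dropping them, would be just as defensible, which is the point made above.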
